Foreword


The Handbook of Mathematical Logic is an attempt to share with the 
entire mathematical community some modern developments in logic. We 
have selected from the wealth of topics available some of those which deal 
with the basic concerns of the subject, or are particularly important for 
applications to other parts of mathematics, or both. 

Mathematical logic is traditionally divided into four parts: model theory, 
set theory, recursion theory and proof theory. We have followed this 
division, for lack of a better one, in arranging this book. It made the 
placement of chapters where there is interaction of several parts of logic a 
difficult matter, so the division should be taken with a grain of salt. Each of 
the four parts begins with a short guide to the chapters that follow. The first 
chapter or two in each part are introductory in scope. More advanced 
chapters follow, as do chapters on applied or applicable parts of mathemat- 
ical logic. Each chapter is definitely written for someone who is 
not a specialist in the field in question. On the other hand, each chapter has 
its own intended audience which varies from chapter to chapter. In 
particular, there are some chapters which are not written for the general 
mathematician, but rather are aimed at logicians in one field by logicians in 
another. 

We hope that many mathematicians will pick up this book out of idle 
curiosity and leaf through it to get a feeling for what is going on in another 
part of mathematics. It is hard to imagine a mathematician who could 
spend ten minutes doing this without wanting to pursue a few chapters, and 
the introductory sections of others, in some detail. It is an opportunity that 
hasn’t existed before and is the reason for the Handbook. 


Guide to Part A


This part of the Handbook is concerned with the fundamental relationship
between mathematical statements (axioms), on the one hand, and
mathematical structures (models) which satisfy them, on the other. The 
emphasis is on the model theory of first-order statements. Barwise’s 
chapter, written for those with no prior knowledge of first-order logic, 
explains the most basic notions. This material is really pre-model theory 
and is needed for most of the chapters in the Handbook. 

Keisler’s chapter contains the real introduction to model theory. By 
glancing through this chapter the reader can see the concerns of the subject 
illustrated with basic results and applications. The next three chapters treat 
important topics from model theory in depth and are aimed more at the 
algebraist. 



Eklof’s chapter discusses the ultraproduct operation, its relation with 
first-order logic, and its positive applications to algebra. Macintyre’s 
chapter discusses both positive and negative applications to algebra of 
Abraham Robinson’s notion of model complete theory and related concepts 
of “algebraically closed”.

Morley’s chapter on homogeneous sets discusses so-called
Ehrenfeucht-Mostowski models. This construction has proven extremely 
useful in model theory and in applications to set theory. It has had some 
applications to other parts of mathematics, but should have more once it 
becomes better known. 

To date the principal application of model theory outside algebra and set 
theory comes from Robinson’s “nonstandard analysis”. Stroyan’s chapter
discusses elementary aspects of the subject and gives a more advanced case 
study of the hidden role infinitesimals play in differential geometry. 

The last three chapters in Part A go beyond ordinary first-order logic. 
Some extensions of first-order logic are mentioned in the last section of 
Barwise’s chapter and discussed in more detail in the last section of 
Keisler’s chapter. Of all the known extensions, the logic $L_{\omega_1\omega}$ has the
smoothest model theory. This logic, and its admissible fragments, are 
discussed in Makkai’s chapter. 

The final chapter, by Kock and Reyes, is quite different in character. It 
gives the category theoretical point of view of some topics from model 
theory and other parts of logic. 

It was planned to have a chapter on stability theory and one on abstract 
model theory. This proved impossible so stability theory is now surveyed in 
Section 8 of Keisler’s chapter. Abstract model theory is discussed at the 
end of Barwise’s chapter and is touched on in Keisler’s chapter. Among the 
other chapters of the Handbook which are particularly relevant to model 
theory are Rabin’s chapter on decidable and undecidable theories, and 
Aczel’s chapter on inductive definitions, both in Part C of the book. 


1. Foreword


This introductory chapter is written for the (less than ideal) mathemati- 
cian who knows next to nothing about mathematical logic, and is entirely 
expository in nature. This might be someone who tried to read a later 
chapter but got bogged down simply because he did not understand the 
basic notions. For most readers a quick reading of Section 2 and the 
introductions to Sections 4 and 5 should suffice. 

Modern mathematics might be described as the science of abstract 
objects, be they real numbers, functions, surfaces, algebraic structures or 
whatever. Mathematical logic adds a new dimension to this science by 
paying attention to the language used in mathematics, to the ways abstract 
objects are defined, and to the laws of logic which govern us as we reason 
about these objects. The logician undertakes this study with the hope of 
understanding the phenomena of mathematical experience and eventually 
contributing to mathematics, both in terms of important results that arise 
out of the subject itself (Gödel’s Second Incompleteness Theorem is the
most famous example) and in terms of applications to other branches of 
mathematics. The chapters of this Handbook are intended to illustrate 
both of these aspects of mathematical logic. 

Modern mathematical logic has its origins in the dream of Leibniz of a 
universal symbolic calculus which could encompass all mental activity of a 
logically rigorous nature, in particular, all of mathematics. This vision was
too grandiose for Leibniz to realize. His writings on the subject were 
largely forgotten and had little influence on the actual course of events. It 
took Boole, Frege, Peano, Russell and Whitehead, Hilbert, Skolem, Gödel,
Tarski and their followers, armed with more powerful abstract methods, 
and motivated (at least in the case of Russell and Hilbert) by apparent 
problems in the foundations of mathematics, to realize a significant part of 
Leibniz’ dream. 


2. How to tell if you are in the realm of first-order logic 


Our goal in this section is quite modest: to give the reader, by means of 
examples, a feeling for what can and what cannot be expressed in 
first-order logic. Most of our examples are taken from the wealth of notions 
in modern algebra with which most mathematicians have at least a nodding 
acquaintance. 

The basic building blocks of first-order logic consist of the logical
connectives: $\wedge$ (and), $\vee$ (or), $\neg$ (not), $\to$ (implies), the equality symbol $=$,
quantifiers $\forall$ (for all), $\exists$ (there exists), plus an infinite sequence of variables
$x, y, z, x_1, y_1, \ldots$ and some parentheses ), ( to help the formulas stay
readable.

In addition to these logical symbols, a set L of primitive non-logical 
symbols is given by the topic under discussion. For example, if we are 
working with abelian groups then the set L has a function symbol + for 
group addition and a constant symbol 0 for the zero element. If we are 
working with orderings, then L has a relation symbol <. For the study of 
set theory, L has a relation symbol $\in$. We will postpone the rather tedious
formal definition of formula of first-order logic until the next section. Here
we stress only that formulas are certain finite strings of symbols. 

The “first” in the phrase “first-order logic” is there to distinguish this
form of logic from stronger logics (like second-order or weak second-order
logic) where certain extralogical notions (like set or natural number) are
taken as given in advance. In particular, in first-order logic the quantifiers $\forall$
and $\exists$ always range over elements of the domain $M$ of discourse. By
contrast, second-order logic allows one to quantify over subsets of $M$ and
functions $F$ mapping, say, $M \times M$ into $M$. (Third-order logic goes on to
sets of functions, etc.) Weak second-order logic allows quantification over
finite subsets of M and over natural numbers. There are good reasons for 
considering first-order logic to be the basic language of mathematics; these 
will be discussed in Section 5. We assume here that the reader has his own 
motivation for wanting to find out what first-order logic is. 


Group theory 


Our first few examples come from group theory. Consider the following 
notions: 

(a) group, 

(b) abelian group, 

(c) abelian group with every element of order $\le n$,

(d) divisible group, 

(e) torsion-free group, 

(f) torsion group. 

The notions (a)-(c) are easily axiomatized by a few first-order axioms.
Notions (d) and (e) take an infinite list of axioms. The last notion (f) is not
first-order. Let’s see why.

A group $G$ is a triple $G = (G, +, 0)$ (where $G$ is a nonempty set, $0 \in G$
and $+$ is a function mapping $G \times G$ into $G$) which satisfies the following
first-order axioms, or sentences:

$$\forall x\,\forall y\,\forall z\,[x + (y + z) = (x + y) + z], \tag{1}$$
$$\forall x\,[x + 0 = x], \tag{2}$$
$$\forall x\,\exists y\,[x + y = 0]. \tag{3}$$


The logician might say that $G$ is a model of (1), (2), (3) and write $G \models$ (1),
(2), (3), instead of saying that $G$ satisfies (1), (2), (3).
An abelian group is a group G satisfying the axiom 


$$\forall x\,\forall y\,[x + y = y + x]. \tag{4}$$


The choice of the symbol “+” in (1)-(4) is dictated by convention only; it
has no real significance. 

To express the next notion we abbreviate the formal term (x + x) by 2x, 
the term ((x + x)+ x) by 3x and, by induction, we abbreviate the term 
$(nx + x)$ by $(n + 1)x$. An abelian group $G$ has every element of order $\le n$
if $G$ is a model of


$$\forall x\,[x = 0 \vee 2x = 0 \vee \cdots \vee nx = 0]. \tag{5}$$


This is a simple first-order sentence. 
An abelian group G is divisible if 


$$\forall n \ge 1\,\forall x\,\exists y\,[ny = x]. \tag{6}$$


This would count as a sentence of weak second-order logic but it is not a 
first-order axiom because the leading quantifier ranges over the set of 
positive natural numbers, rather than over the domain of discourse G. We 
can, however, replace this expression by the following infinite list of 
axioms: 


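$$\forall x\,\exists y\,[2y = x], \qquad (6)_2$$
$$\forall x\,\exists y\,[3y = x], \qquad (6)_3$$
$$\vdots$$
$$\forall x\,\exists y\,[ny = x], \qquad (6)_n$$
$$\vdots$$


(We left off $(6)_1$ since it is the trivial sentence $\forall x\,\exists y\,[x = y]$.) For most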
purposes such an effectively presented infinite list of axioms is practically as 
good as a finite list. Still, it is worth proving for our own satisfaction that it 
is not just lack of imagination which forces us to use an infinite list to 
express the notion. 


2.1. PROPOSITION. Any finite set of first-order sentences true in all divisible
abelian groups is true in some nondivisible abelian group. 


In other words, the notion of divisible abelian group is not finitely
axiomatizable in first-order logic. We delay the proof of this result for a few 
paragraphs. 

We discover essentially the same phenomenon when we attempt to 
axiomatize the concept of torsion-free abelian group: 


$$\forall n \ge 1\,\forall x\,[x \neq 0 \to nx \neq 0]. \tag{7}$$


This sentence of weak second-order logic turns into an infinite list of
first-order axioms:


$$\forall x\,[x \neq 0 \to nx \neq 0]. \qquad (7)_n$$


We have the corresponding negative result. 


2.2. PROPOSITION. The notion of torsion-free abelian group is not finitely
axiomatizable in first-order logic. 


An abelian group G is torsion if it satisfies 


$$\forall x\,\exists n \ge 1\,[nx = 0]. \tag{8}$$


This is a sentence of weak second-order logic but it is not first-order 
because it has the quantifier $\exists n$ over natural numbers. We could try to
imitate (5) but look what happens: 


$$\forall x\,[x = 0 \vee 2x = 0 \vee \cdots \vee nx = 0 \vee \cdots]. \tag{8$'$}$$


This sort of expression is analogous to an infinite formal power series and 
the study of such idealized “infinitary formulas” has turned out to be quite
profitable (see 5.3, and Chapters A.2 and A.7) but it is not part of ordinary 
first-order logic. To clinch matters we will prove the following result. 


2.3. PROPOSITION. The set of first-order sentences true in all torsion abelian
groups is true in some abelian group H which is not torsion. 


In fact, what we will show is that if G is an abelian group with no finite 
bound on the order of its elements, then there is a group H which is not 
torsion but such that $G \equiv H$, which means that every first-order sentence
true in G is also true in H, and vice versa. Therefore the class of torsion 
groups cannot be characterized even by a set of first-order axioms — finite 
or infinite. 


Nonaxiomatizability results


There are two standard tools for proving nonaxiomatizability results. 
They are corollaries of the Completeness Theorem and will be proved in 
Section 4. We will use these tools to prove all the results of this section. 


2.4. COMPACTNESS THEOREM (Gödel-Malcev). Let $T$ be any set of first-order
axioms. If for every finite subset $T_0$ of $T$ there is a model of all the axioms in
$T_0$, then there is a single model of all the axioms in $T$.


An alternate form of the Compactness Theorem is sometimes more
convenient. Let us write $T \models \psi$ to indicate that $\psi$ is a logical consequence of
$T$ in the sense that $\psi$ is true in all models which make all the axioms of $T$
true. Then the Compactness Theorem is equivalent to the statement: If
$T \cup \{\psi\}$ is a set of first-order sentences and $T \models \psi$, then there is a finite $T_0 \subseteq T$
such that $T_0 \models \psi$. To see that this follows from 2.4, apply 2.4 to
$T \cup \{\neg\psi\}$, where $\neg\psi$ asserts that $\psi$ is false. To prove 2.4 from this
version, let $\psi$ be some absurd sentence like $\exists x\,(x \neq x)$. The Compactness
Theorem fails for second-order logic or even weak second-order logic, as
the proof of 2.1 will show.

The other property of first-order logic sometimes used to prove nonaxiomatizability
results is the following Löwenheim-Skolem Theorem. This
important principle also holds for weak second-order logic but not for 
second-order logic. 


2.5. LÖWENHEIM-SKOLEM THEOREM. Let $\kappa$ be an infinite cardinal and let $T$
be a set of at most $\kappa$ first-order axioms. If there is a model making all the
axioms in $T$ true, then there is such a model whose set of elements has
cardinality $\le \kappa$.


Remark. As long as the set L of nonlogical symbols is finite, or even
countable, as has been the case up to now, there can be only a countable set 
of first-order formulas, since every formula is a finite string. Thus, for such 
L, every set T of axioms which has a model has a countable model, by 2.5. 


PROOF OF 2.1. Let $\{\psi_1, \ldots, \psi_k\}$ be a finite set of first-order sentences true in
all divisible abelian groups and let $\psi$ be the conjunction $(\psi_1 \wedge \cdots \wedge \psi_k)$. Our
task is to prove that $\psi$ is true in some nondivisible abelian group. We apply
the second version of the Compactness Theorem. Let $T$ be the set of
axioms (1)-(4) plus all the axioms $(6)_n$. Thus $T$ is a set of axioms for
divisible abelian groups. The hypothesis is that $T \models \psi$. By the Compactness
Theorem there is a finite $T_0 \subseteq T$ such that $T_0 \models \psi$. This means that there is
an $N$ such that $\psi$ is true in all abelian groups which satisfy $\forall x\,\exists y\,[ny = x]$
for $n = 2, \ldots, N$. (This much of the proof is common to many proofs.)
Taking the first example which comes to mind, let $Z_p$ be the group of
integers mod $p$, for some prime $p > N$. The group $Z_p$ is a model of
$\forall x\,\exists y\,[ny = x]$ for $1 \le n < p$, since the map which sends $x$ to $nx$ is one-one
and hence onto. Thus $\psi$ is true in $Z_p$. But $Z_p$ is far from being divisible since
$px = 0$ for all $x \in Z_p$. □
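
The finite computation at the heart of this proof can be checked by brute force. The following Python sketch is only an illustration: the function names `satisfies_6n` and `witnesses_2_1` are hypothetical, and the group $Z_p$ is modelled as the integers $0, \ldots, p-1$ under addition mod $p$. It tests whether $Z_p$ satisfies $(6)_n$ for $n = 2, \ldots, N$ while failing $(6)_p$.

```python
def satisfies_6n(p: int, n: int) -> bool:
    """Check axiom (6)_n, 'for all x there is y with n*y = x', in Z_p."""
    return all(any((n * y) % p == x for y in range(p)) for x in range(p))


def witnesses_2_1(p: int, N: int) -> bool:
    """True if Z_p satisfies (6)_2, ..., (6)_N but fails (6)_p.

    For a prime p > N this is the situation used in the proof of 2.1:
    Z_p models any given finite part of the divisibility axioms
    without being divisible.
    """
    holds_up_to_N = all(satisfies_6n(p, n) for n in range(2, N + 1))
    fails_at_p = not satisfies_6n(p, p)
    return holds_up_to_N and fails_at_p


if __name__ == "__main__":
    print(witnesses_2_1(7, 6))   # p = 7 is a prime larger than N = 6
    print(witnesses_2_1(6, 5))   # 6 is not prime: (6)_2 already fails in Z_6
```

The second call shows why the proof insists on a prime: in $Z_6$ the map $x \mapsto 2x$ is not one-one, so $(6)_2$ fails.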


The proof of 2.2 is just like the proof of 2.1 in form so is left to the 
reader. 


PROOF OF 2.3. Let $G = (G, +, 0)$ be any (possibly torsion) group such that,
for each $n$, there is an element $x_n$ of $G$ of order $\ge n$. For example, $G$ might
be the direct sum of all $Z_p$ over all primes $p$. We will prove that there is a
nontorsion group $H$ such that $G$ and $H$ satisfy exactly the same first-order
sentences. Again we use the Compactness Theorem, this time the first
version. Take a new constant symbol $c$ and let $T$ consist of all sentences
(not mentioning $c$) true in the group $(G, +, 0)$ plus all the sentences:
$2c \neq 0$, $3c \neq 0$, $4c \neq 0$, etc. Thus $T$ is a set of sentences in a language which
has a name $c$ for a new distinguished element. If $H = (H, +, 0, c)$ satisfies
all the axioms in $T$ then $(H, +, 0)$ will be a group with the same first-order
axioms true as are true in $G$ but $H$ will not be torsion since the
distinguished element $c$ will have infinite order. All we need to see is that
there is an $H$ which is a model of all of $T$. By the Compactness Theorem, it
suffices to find a model of each finite $T_0 \subseteq T$. But this is easy. Given $T_0$, let
$N$ be bigger than all $n$ such that the sentence $nc \neq 0$ is in $T_0$. Then we can
use $x_N$ to make $T_0$ true. That is, $T_0$ is true in the group $(G, +, 0, x_N)$ with
distinguished element $x_N$, since the order of $x_N$ is $\ge N$. □
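
The last step of this proof, choosing an interpretation of $c$ for a given finite $T_0$, can be sketched in Python for the example where $G$ is the direct sum of the groups $Z_p$. This is a minimal sketch under stated assumptions: the names `interpret_c` and `makes_T0_true` are hypothetical, a finite $T_0$ is represented only by the set of those $n$ for which it contains the sentence $nc \neq 0$, and $x_N$ is taken to be the generator $1$ of a summand $Z_p$ with $p \ge N$ prime.

```python
from typing import Iterable, Tuple


def is_prime(k: int) -> bool:
    """Trial-division primality test, adequate for small illustrations."""
    return k >= 2 and all(k % d for d in range(2, int(k ** 0.5) + 1))


def interpret_c(ns: Iterable[int]) -> Tuple[int, int]:
    """Pick a summand Z_p and an element x of it to interpret c.

    ns lists the n with 'n c != 0' in T_0.  N is chosen bigger than
    all of them, and x = 1 in Z_p (p prime, p >= N) has order p >= N.
    """
    N = max(ns, default=1) + 1
    p = N
    while not is_prime(p):
        p += 1
    return p, 1


def makes_T0_true(ns: Iterable[int]) -> bool:
    """Check that every sentence 'n c != 0' in T_0 holds for this choice."""
    ns = list(ns)
    p, x = interpret_c(ns)
    return all((n * x) % p != 0 for n in ns)


if __name__ == "__main__":
    print(interpret_c([2, 3, 4, 10]))    # a prime p >= 11 together with x = 1
    print(makes_T0_true([2, 3, 4, 10]))
```

No single element of $G$ works for all $n$ at once, which is exactly why the Compactness Theorem is needed to pass from the finite sets $T_0$ to all of $T$.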


The real numbers 


Our first set of examples had to do with whole classes of structures. We 
now turn to one specific structure, the ordered field $R = (R, +, \cdot, <, 0, 1)$ of
real numbers. Most students of advanced calculus suffer through a con- 
struction of R and a proof that certain axioms characterize R up to 
isomorphism. The axioms are not first-order, however. 


2.6. Proposition. There is no first-order set of axioms which characterize R 
up to isomorphism. 


PROOF. By the remark following 2.5, the set of all sentences true in R is
countable. By the Löwenheim-Skolem Theorem, this set has a countable
model, which cannot be isomorphic to the uncountable structure R. □


Since the Löwenheim-Skolem Theorem also holds for weak second-order
logic, the proof of 2.6 shows that there is a countable field with the
same weak second-order properties as R, among which is the Archimedean
axiom:


$$\forall x\,\exists n\,[x < n1],$$


where $n1$ is the term $((1 + 1) + \cdots + 1)$, $n$ times, as before. This is a weak
second-order statement since the quantifier $\exists n$ ranges not over the
elements of an arbitrary model but over the real natural numbers.

The proof of 2.6 is misleading because it makes one feel that the problem 
has to do with the fact that there are undefinable real numbers, since there 
are more reals than there are possible definitions in first-order logic with 
countably many symbols. We can correct this impression by proving a 
similar result for the enriched structure $(R, +, \cdot, <, r)_{r \in R}$ where every real
number r is treated as a distinguished element and is given a name (i.e. 
constant symbol). We continue the (slightly confusing) practice of using an 
object r for its own name. 


2.7. PROPOSITION. There is a non-Archimedean field *R extending R which
satisfies all the first-order sentences true in R, even if we allow names for all
real numbers. 


PROOF. The proof is similar to, but actually simpler than, the proof of 2.3. We
take another new constant symbol c and write the sentence 


$$c > r$$


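for all real numbers r. To these sentences we add all true first-order
sentences of R. Every finite subset of the resulting set is true in R itself, once
c is interpreted as a real number larger than the finitely many r it mentions.
By the Compactness Theorem this set of sentences therefore has a
model *R. We can consider R as a submodel of *R. Since the field axioms
are true in R they are also true in *R, and *R is non-Archimedean because
c exceeds every real number. □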


Most of the theorems of calculus are first-order so that they will hold in 
*R. Thus 2.7, far from being a negative result, is actually the basis of
analysis by means of infinitesimals, or, in other words, Robinson’s 
“nonstandard” analysis. (The element 1/c will be a positive infinitesimal.) 
Thus, it is only a mild exaggeration to say that the universal symbolic 
calculus of Leibniz’ imagination eventually led to a justification of his use 
of infinitesimals in the calculus. For more on this, see Chapter A.6 and its
1.1 in particular. 

Proposition 2.7 leaves us with the question: Which of the usual axioms 
for the real numbers is not first-order? The answer is: the completeness 
axiom, 

$$\forall X \subseteq R\,[\text{if } X \neq \emptyset \text{ is bounded, then } X \text{ has a l.u.b.}].$$


This is not first-order because the universal quantifier ranges over the set of 
all subsets X of R. Thus, the proof that the real numbers are unique is 
really relative to a universe of set theory. 


Rings and fields 

The completeness property of the field of real numbers is not first- 
order, as we have seen. Let us conclude this introduction into first-order 
properties by seeing some of the properties of rings and fields that are first 
order. In this discussion our basic language (or vocabulary) L has the 
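nonlogical symbols $+, \cdot, 0, 1$. The basic axioms for commutative rings with
identity consist of (1)-(4) above (the abelian group axioms) plus the
following first-order axioms:


$$\forall x\,\forall y\,[x \cdot y = y \cdot x],$$
$$\forall x\,\forall y\,\forall z\,[(x \cdot y) \cdot z = x \cdot (y \cdot z)],$$
$$\forall x\,\forall y\,\forall z\,[x \cdot (y + z) = (x \cdot y) + (x \cdot z)],$$
$$\forall x\,[x \cdot 1 = x],$$
$$0 \neq 1.$$

A ring $\mathfrak{R} = (R, +, \cdot, 0, 1)$ is an integral domain if it is a model of

$$\forall x\,\forall y\,[x \cdot y = 0 \to (x = 0 \vee y = 0)].$$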


Before going on to fields, let us pause to see what to do about a prime 
concern in ring theory, the notion of an ideal. A proper ideal of a 
commutative ring $\mathfrak{R} = (R, +, \cdot, 0, 1)$ is simply a nonempty, proper subset
$I \subseteq R$ which is a subgroup of $\mathfrak{R}$ under addition such that for all $x \in R$ and
all $y \in I$, $x \cdot y \in I$. To express this in first-order logic we add a name for $I$
and consider structures of the form $(\mathfrak{R}, I) = (R, +, \cdot, 0, 1, I)$. Then $I$ is an
ideal of $\mathfrak{R}$ if $(\mathfrak{R}, I)$ is a model of the following four axioms. To keep set
theory out of the picture, we think of $I$ as a 1-place relation and write $I(x)$
rather than $x \in I$. The middle two axioms assert that $I$ is a subgroup under
$+$; the last asserts that $I$ is closed under multiplication by any element $x$
of $\mathfrak{R}$.


$$I(0) \wedge \neg I(1),$$
$$\forall x\,\forall y\,[I(x) \wedge I(y) \to I(x + y)],$$
$$\forall x\,\forall y\,[I(x) \wedge x + y = 0 \to I(y)],$$
$$\forall x\,\forall y\,[I(y) \to I(x \cdot y)].$$

An ideal $I$ is a prime ideal if $(\mathfrak{R}, I)$ is a model of

$$\forall x\,\forall y\,[I(x \cdot y) \to I(x) \vee I(y)].$$


Up till now everything has been simple. Either the natural definition of a 
notion was first-order, or else we have been able to show that the notion is 
not first-order. This is not always the case. Indeed, some of the most useful 
applications of logical tools (like ultraproducts) hinge on finding some 
first-order equivalent to a notion that doesn’t look first-order. This is often 
a nontrivial matter but we give only a simple example. Others can be found 
in Chapters A.3 and A.4. 

An ideal $I$ is a maximal ideal of $\mathfrak{R}$ if $(\mathfrak{R}, I)$ is a model of


$$\forall J\,[I \subseteq J \wedge J \text{ an ideal} \to J = I \text{ or } J = R]. \tag{9}$$


This is the same form of second-order sentence as the completeness axiom 
for the reals, but this time we can find an equivalent first-order axiom by 
recalling the lemma which says that $I$ is maximal in $\mathfrak{R}$ iff the quotient ring
$\mathfrak{R}/I$ is a field. To say that $\mathfrak{R}/I$ is a field is to say that for all $x$, if $x + I$ is not
the coset $0 + I$, then there is a $y$ such that $(x + I) \cdot (y + I) = 1 + I$. Since
$(x + I) \cdot (y + I) = x \cdot y + I$, we can express (9) by the axiom


$$\forall x\,[\neg I(x) \to \exists y\,(x \cdot y + I = 1 + I)]$$

which, when written out in detail, becomes

$$\forall x\,[\neg I(x) \to \exists y\,\exists z\,(I(z) \wedge x \cdot y + z = 1)]. \tag{9$'$}$$


While (9)' loses the intuitive content of (9), it is equivalent to (9) and it is
first-order, which is what matters here. 
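
The equivalence of (9) and (9)' can be tested by brute force in a small finite ring. The sketch below is an illustration only and makes some assumptions not in the text: the ring is the integers mod $m$, the function names are hypothetical, and the second-order quantifier in (9) is handled using the standard fact that the ideals of the integers mod $m$ are exactly the sets of multiples of $d$ for the divisors $d$ of $m$.

```python
def ideal(d: int, m: int) -> frozenset:
    """The ideal of multiples of d in the ring of integers mod m."""
    return frozenset((d * k) % m for k in range(m))


def satisfies_9_prime(I: frozenset, m: int) -> bool:
    """First-order condition (9)': for every x outside I there are
    y in the ring and z in I with x*y + z = 1."""
    return all(
        any((x * y + z) % m == 1 for y in range(m) for z in I)
        for x in range(m)
        if x not in I
    )


def satisfies_9(I: frozenset, m: int) -> bool:
    """Second-order condition (9): every ideal J containing I is
    either I itself or the whole ring."""
    whole = frozenset(range(m))
    ideals = [ideal(d, m) for d in range(1, m + 1) if m % d == 0]
    return all(J == I or J == whole for J in ideals if I <= J)


if __name__ == "__main__":
    m = 12
    for d in (2, 3, 4, 6, 12):  # the proper ideals of the integers mod 12
        I = ideal(d, m)
        print(d, satisfies_9(I, m), satisfies_9_prime(I, m))
```

For each proper ideal the two checks agree, although `satisfies_9` has to enumerate ideals while `satisfies_9_prime` only ever quantifies over elements of the ring.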

Here is a good exercise. A ring $\mathfrak{R}$ is a principal ideal ring if $\mathfrak{R}$ is a model
of the second-order sentence


$$\forall I\,[I \text{ an ideal} \to \exists x\,\forall y\,(I(y) \leftrightarrow \exists z\,(y = z \cdot x))].$$


This has the same general form as (9) but it cannot be expressed in 
first-order logic. Indeed, a simple compactness argument shows that there 
is a ring $\mathfrak{R}$ with the same first-order properties as the ring $Z$ of integers
(written $\mathfrak{R} \equiv Z$) but where $\mathfrak{R}$ is not a principal ideal ring.

A commutative ring $\mathfrak{R}$ is a field if $\mathfrak{R}$ is a model of

$$\forall x\,\exists y\,[x \neq 0 \to x \cdot y = 1].$$

A field $\mathfrak{R}$ is of characteristic $p$ ($p$ a prime) if $\mathfrak{R}$ is a model of

$$p1 = 0.$$

On the other hand $\mathfrak{R}$ is of characteristic 0 if

$$\forall p\,[p \text{ a prime} \to p1 \neq 0]. \tag{10}$$

Did you catch the weak second-order sentence? The quantifier ranges not
over $\mathfrak{R}$ but over the prime numbers. Thus we must replace (10) by an
infinite list:

$$p1 \neq 0, \qquad (10)_p$$


one axiom for each prime p. The result corresponding to Propositions 2.1 
and 2.2 becomes more interesting here. The proof is just like the proof of 
2.1. 


2.8. PROPOSITION. Any first-order sentence $\psi$ true in all fields of characteristic
0 is true in all fields of characteristic $p$ for sufficiently large $p$, that is, for $p$
greater than some integer $N_\psi$.


Let us abbreviate the formal term $(x \cdot x)$ by $x^2$ and, by induction,
abbreviate the formal term $(x^n \cdot x)$ by $x^{n+1}$. A field $\mathfrak{R}$ is algebraically closed
if it is a model of all axioms of the form


$$\forall x_0 \cdots \forall x_n\,[x_n \neq 0 \to \exists y\,(x_n y^n + x_{n-1} y^{n-1} + \cdots + x_1 y + x_0 = 0)],$$


which says that every polynomial of degree n has a root. 


Set theory 


The first-order axioms for set theory are discussed at length in Shoenfield.
The basic language L of set theory has only a membership symbol $\in$.
The axioms are arrived at by a careful analysis of our informal concept of 
forming sets, sets of sets, sets of sets of sets, and so on into the transfinite. 
The resulting set of axioms is called ZF, after Zermelo and Fraenkel. The 
first axiom about sets one thinks of is the axiom of extensionality: a set is 
completely determined by its members. This becomes 


$$\forall x\,\forall y\,[\forall z\,(z \in x \leftrightarrow z \in y) \to x = y].$$


Properties of mathematical theories


The various first-order theories we have discussed above have radically 
different properties from a logical point of view. Let us mention a few of 
them. 

The theory of abelian groups is a decidable theory, whereas the theory of 
groups is undecidable. That is, one can give an effective procedure which 
will tell of an arbitrary sentence $\psi$ involving $+$ and $0$ whether or not $\psi$ is a
logical consequence of (1)-(4), i.e., whether or not $\psi$ is true in all abelian
groups. There can be no such procedure for the theory of groups. This sort 
of question is dealt with in Chapter C.1 and, more fully, in Chapter C.3. 

The theory of algebraically closed fields of a fixed characteristic is a 
complete theory, which is to say that any two algebraically closed fields 
$F_1$, $F_2$ of the same characteristic have all the same first-order properties,
i.e., $F_1 \equiv F_2$. On the other hand, most of the first-order theories are not
complete. For example, to the theory of rings we can add either
$\forall x\,\exists y\,[x \neq 0 \to x \cdot y = 1]$ or its negation $\neg\forall x\,\exists y\,[x \neq 0 \to x \cdot y = 1]$ and
have a consistent theory. This just amounts to the triviality that some rings 
are fields and some are not. Combining the above mentioned completeness 
of the theory of algebraically closed fields with the Completeness Theorem 
shows, by Theorem 7.2 in Chapter C.1, that the theory of algebraically 
closed fields of characteristic 0 is decidable. Consider the effective procedure
$P$ for deciding whether or not a sentence involving $+, \cdot, 0, 1$ is a
consequence of this theory. Since all models of this theory have the same
first-order properties, we can apply $P$ to decide which sentences involving
$+, \cdot, 0, 1$ are true in the field $C = (C, +, \cdot, 0, 1)$ of complex numbers. This is
expressed by saying that the field C of complex numbers is a decidable 
model. 

Gödel’s famous Incompleteness Theorem shows that the ring $Z$ of
integers is not decidable. Thus, any mechanical procedure which attempts
to decide of sentences $\psi$ involving $+, \cdot, 0, 1$ whether or not $\psi$ is true in $Z$
must fail for infinitely many sentences. A consequence of this is that any
effective list $T$ of true axioms we write down about $Z$ must inevitably yield
an incomplete theory (since otherwise the argument used on $C$ would work
on $Z$). Gödel’s Second Incompleteness Theorem in fact tells us how to go
about finding a sentence $\psi$ true in $Z$ but not a consequence of $T$. Chapter
D.1 contains a thorough discussion of Gödel’s Incompleteness Theorems.
These results are usually stated in terms of the structure $N = (N, +, \cdot, 0, 1)$
of natural numbers, rather than in terms of the ring Z. The standard 
definition of Z from N shows that the results apply equally to Z. 

There are a number of topics that could be gone into at this point, but it
is more reasonable to let the topics speak for themselves in the chapters 
that follow. Chapter A.2 discusses the basics of the theory of models for 
first-order logic. Chapter A.3 treats the ultraproduct construction, an 
algebraic version of the compactness theorem. Chapter A.4 will also be of 
particular interest to algebraists, treating as it does, model theoretic 
analogues of the notion of ‘‘algebraically closed” and their applications in 
algebra. The fundamental results of Ax-Kochen and Ersov are discussed in 
both of the chapters. 


3. The formalization of first-order logic 


Let L be a given set of function symbols, relation symbols and constant 
symbols. We make no restriction on the size of the set L, though usually L 
is finite or countably infinite. Each function symbol $f \in L$ has a positive
integer $\#(f)$ assigned to it; if $n = \#(f)$, then $f$ is called an $n$-ary function
symbol. Similarly, each relation symbol $R \in L$ comes with a positive
integer $\#(R)$; if $n = \#(R)$ then $R$ is said to be an $n$-ary relation symbol.


Examples. For the language $L = \{+, 0\}$ appropriate to group theory there
are no relation symbols and $\#(+) = 2$. For the language $L = \{\in\}$ of set
theory, there are no function or constant symbols and $\#(\in) = 2$.


Given a language $L$ we have a natural notion of structure or model for $L$.
A structure $\mathfrak{M}$ assigns a nonempty collection $M$ of objects over which the
quantifiers range, and $\mathfrak{M}$ also assigns appropriate interpretations of the
basic primitive relation, function and constant symbols of $L$.


3.1. DEFINITION. A (set-theoretic) structure for $L$ is a pair $\mathfrak{M} = (M, F)$
where $M$ is a nonempty set and $F$ is an operation with domain $L$ such that,
writing $x^{\mathfrak{M}}$ for $F(x)$,
(i) if $R \in L$ is an $n$-ary relation symbol, then $R^{\mathfrak{M}} \subseteq M^n$;
(ii) if $f \in L$ is an $n$-ary function symbol, then $f^{\mathfrak{M}} : M^n \to M$;
(iii) if $c \in L$ is a constant symbol then $c^{\mathfrak{M}} \in M$.


One often writes $\mathfrak{M}$ as $(M, R^{\mathfrak{M}}, \ldots, f^{\mathfrak{M}}, \ldots, c^{\mathfrak{M}}, \ldots)$. The parenthetical
adjective “set-theoretic” in 3.1 is there because one sometimes wants to
consider more generous notions of structures where $M$ may be too large to
be a set. For example, the natural structure $\mathfrak{M}$ for the language $L = \{\in\}$ of
set theory has domain $M$ the collection $V$ of all sets, which is not itself a
set. However, a consequence of the Completeness Theorem is that any set
of axioms that has a model in any reasonable sense will have as a model a
reasonably small set theoretic structure. We henceforth deal only with 
set-theoretic structures. 


Example. If $L = \{+, 0\}$ is the language appropriate to group theory, then a
structure for $L$ has the form $\mathfrak{M} = (M, +^{\mathfrak{M}}, 0^{\mathfrak{M}})$ where $M$ is a nonempty set,
$+^{\mathfrak{M}} : M \times M \to M$ and $0^{\mathfrak{M}} \in M$. We usually use $G$ rather than $\mathfrak{M}$ and drop
the superscripts.
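
For a finite domain, a structure in the sense of 3.1 is easy to write down concretely, and the group axioms (1)-(3) of Section 2 can then be checked by exhaustive search. The Python sketch below is only an illustration: the dictionary layout and the names `cyclic_structure` and `is_group` are assumptions made for the example, not notation from the text.

```python
from itertools import product


def cyclic_structure(n: int) -> dict:
    """A structure for L = {+, 0}: domain {0, ..., n-1}, with +
    interpreted as addition mod n and 0 interpreted as 0."""
    return {"M": list(range(n)), "+": lambda a, b: (a + b) % n, "0": 0}


def is_group(S: dict) -> bool:
    """Check the first-order axioms (1), (2), (3) in a finite structure."""
    M, add, zero = S["M"], S["+"], S["0"]
    axiom1 = all(
        add(x, add(y, z)) == add(add(x, y), z)
        for x, y, z in product(M, repeat=3)
    )
    axiom2 = all(add(x, zero) == x for x in M)
    axiom3 = all(any(add(x, y) == zero for y in M) for x in M)
    return axiom1 and axiom2 and axiom3


if __name__ == "__main__":
    print(is_group(cyclic_structure(5)))
    # Same language, different interpretation: + read as max on {0, 1, 2}.
    print(is_group({"M": [0, 1, 2], "+": max, "0": 0}))
```

The second structure interprets the same symbols but satisfies only (1) and (2): no element $y$ gives $\max(1, y) = 0$, so (3) fails.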


We now turn to syntactic notions of first-order logic. Recall the basic
building blocks $\wedge, \vee, \neg, \to, =, \forall, \exists, x, y, z, \ldots$, ), (, mentioned early in Section 2.
Let $L$ be a fixed language. Any finite sequence, each element of
which is one of these basic symbols or an element of L, is called an 
expression. From the set of expressions we want to single out the ones to 
which we can assign a meaning. 


3.2. DEFINITION. The terms of L form the smallest set of expressions 
containing the variables x, y, z,..., all constant symbols of L (if any) and 
closed under the formation rule: if t_1, ..., t_n are terms of L and if f ∈ L is an
n-ary function symbol, then the expression f(t_1 ⋯ t_n) is a term of L. A
closed term is a term in which no variable appears. 

If there are no function symbols in L then the formation rule is vacuous 
so the only terms are variables and the constants of L. 


Example. If L={+,0} then, strictly speaking, the terms are expressions 
like 
    +(xy),    +(0 +(x0)).

We naturally agree to abbreviate these by the more natural

    x + y,    0 + (x + 0),

respectively, thus moving the symbol + inside and leaving off the outer
parentheses if no confusion arises. As in Section 2 we use nx as an
abbreviation of (⋯((x + x) + x) + ⋯ + x), n times, for n ≥ 1. For this
language the only closed terms are the expressions built up from 0 and +,
none of which are very interesting from a group theoretic point of view.


3.3. DEFINITION. An atomic formula of L is an expression of either of the
two forms:

    (t_1 = t_2),    R(t_1 ⋯ t_n)

where, in the first case, t_1 and t_2 are terms of L. In the second case R ∈ L is
any n-ary relation symbol and t_1, ..., t_n are terms of L.


Examples. In the language L = {+ ,0} of group theory there are not any 
relation symbols, so the only atomic formulas are statements of equalities 
between terms, expressions like 

    (x + y = z),    (x + y = y + x),    ((x + y) + z = x + (y + z)).


In the language L = {∈} of set theory where all terms are variables, the
only atomic formulas are those of the form (v = w) and ∈(vw) for
variables v, w. We write the latter as v ∈ w.


3.4. DEFINITION. The first-order formulas of L form the smallest set of 
expressions containing the atomic formulas and closed under the following 
formation rules: 

(i) If φ, ψ are formulas so are the expressions

    ¬φ,    (φ ∧ ψ),    (φ ∨ ψ),    (φ → ψ);

(ii) if φ is a formula and v is a variable, then (∃vφ) and (∀vφ) are
formulas.

We associate parentheses to the right in strings where the same symbol is
repeated. Thus φ ∧ ψ ∧ θ is (φ ∧ (ψ ∧ θ)) and φ → ψ → θ is (φ → (ψ → θ)).


Example. Let L={+,0}. The following are formulas: 
    (x + y = 0),
    (∃y (x + y = 0)),
    (∀x (∃y (x + y = 0))).


The last is what we wrote more informally as sentence (3) in Section 2.
Note that in the first formula both x and y are sort of "floating free", in the
second formula y is "bound up" by ∃ and in the last formula both x and y
are "bound". Only the last formula makes any intuitive sense as an axiom.
This is similar to the situation in elementary calculus where 


    x^2 + 2x + 1


is an expression which has a variable in it, but the expression 


    ∫_0^1 (x^2 + 2x + 1) dx
has x "bound"; it has a meaning independent of x. The definite integral is
performing roughly the same syntactic role that the quantifiers ∃ and ∀
play in logic. The next definition makes the notion of "free variable"
precise. One can think of it as defined by induction on the length of the
formula φ.


3.5. DEFINITION. The set FV(φ) of free variables of a formula φ is defined
as follows:
(i) If φ is an atomic formula, then FV(φ) is just the set of variables
appearing in the expression φ,
(ii) FV(¬φ) = FV(φ),
(iii) FV(φ ∧ ψ) = FV(φ ∨ ψ) = FV(φ → ψ) = FV(φ) ∪ FV(ψ),
(iv) FV(∃vφ) = FV(∀vφ) = FV(φ) − {v}.
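
The four clauses of 3.5 translate directly into a recursive function. The following is a minimal sketch in Python; the tuple encoding of terms and formulas and the names term_vars and free_vars are assumptions made for illustration, not notation from the text.

```python
# Assumed encoding (hypothetical, for illustration only):
#   terms:    ("var", "x"), ("const", "0"), ("app", "+", [t1, t2])
#   formulas: ("eq", t1, t2), ("rel", "R", [t1, ...]), ("not", phi),
#             ("and" | "or" | "imp", phi, psi), ("exists" | "forall", "v", phi)

def term_vars(t):
    """Set of variables appearing in the term t."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "const":
        return set()
    result = set()
    for arg in t[2]:                      # t is ("app", f, args)
        result |= term_vars(arg)
    return result

def free_vars(phi):
    """FV(phi), following clauses (i)-(iv) of Definition 3.5."""
    kind = phi[0]
    if kind == "eq":                      # (i) atomic: all variables appearing
        return term_vars(phi[1]) | term_vars(phi[2])
    if kind == "rel":
        result = set()
        for arg in phi[2]:
            result |= term_vars(arg)
        return result
    if kind == "not":                     # (ii)
        return free_vars(phi[1])
    if kind in ("and", "or", "imp"):      # (iii)
        return free_vars(phi[1]) | free_vars(phi[2])
    if kind in ("exists", "forall"):      # (iv)
        return free_vars(phi[2]) - {phi[1]}
    raise ValueError(f"unknown formula kind: {kind}")

# The three formulas of the example following Definition 3.4.
x, y, zero = ("var", "x"), ("var", "y"), ("const", "0")
atom = ("eq", ("app", "+", [x, y]), zero)
some_inverse = ("exists", "y", atom)
all_inverse = ("forall", "x", some_inverse)
```

Under this encoding only all_inverse has an empty set of free variables, which matches the remark that only the last formula makes sense as an axiom.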


It is common practice to use the notation φ(v_1, ..., v_n) to indicate that
FV(φ) ⊆ {v_1, ..., v_n} without implying that all of v_1, ..., v_n are actually free
in φ. This is similar to the practice in algebra of writing p(x_1, ..., x_n) for a
polynomial p in the variables x_1, ..., x_n without implying that all of them
have nonzero coefficient.


3.6. DEFINITION. A (first-order) sentence of L is a formula without any free 
variables. 


So far the terms, formulas and sentences of L are simply finite strings of 
symbols. We must make sure to assign the intended meanings to our logical 
symbols so that the formulas of Section 2 express what we intend. This is 
done by defining the satisfaction relation 𝔐 ⊨ φ between structures on the
one hand (the left one) and sentences on the other.

Let 𝔐 = (M, ...) be a structure for a language L. An assignment in 𝔐 is
a function s with domain the set of variables of L and range a subset of M.
We think of s as assigning a meaning s(v) to the variable v. We can then
define, for each term t of L, a function t^𝔐 which maps assignments to
elements of M.


3.7. DEFINITION. Let 𝔐 be given. For t a term of L define t^𝔐 as follows:
(i) If t is a constant symbol c, then t^𝔐(s) = c^𝔐 for all s;
(ii) if t is a variable v, then t^𝔐(s) = s(v) for all s;
(iii) if t is the term f(t_1 ⋯ t_n) then, for all s, define

    t^𝔐(s) = f^𝔐(t_1^𝔐(s), ..., t_n^𝔐(s)).

In (iii), since each of t_1, ..., t_n is simpler than t we can assume by
induction on (the complexity of) terms that t_1^𝔐, ..., t_n^𝔐 are already defined.
f^𝔐 is defined since 𝔐 is a structure for L and f ∈ L. The reader should note
that if s_1 and s_2 agree on all variables v appearing in t, then t^𝔐(s_1) =
t^𝔐(s_2). Thus t^𝔐, as a function, depends on only a finite number of values of
its argument s.


Example. Let L be the language of rings and let t be the term, or
polynomial,

    x^2 + 2x + 1.

Then t^ℜ, for any ring ℜ, is the corresponding polynomial function from ℜ
into ℜ. If s(x) = a, then t^ℜ(s) = a^2 + 2a + 1, the operations of + and ·
being those of the ring ℜ.
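
Definition 3.7 can be read as a recursive evaluator. The sketch below is only an illustration under assumed conventions: terms are nested tuples, the ring is the integers modulo 5, and the names eval_term, funcs and consts are hypothetical.

```python
# Assumed term encoding: ("var", "x"), ("const", "1"), ("app", "+", [t1, t2]).

def eval_term(t, funcs, consts, s):
    """Value of the term t under the assignment s (a dict: variable -> element)."""
    kind = t[0]
    if kind == "const":                   # clause (i)
        return consts[t[1]]
    if kind == "var":                     # clause (ii)
        return s[t[1]]
    # clause (iii): evaluate the simpler subterms first, then apply f
    args = [eval_term(arg, funcs, consts, s) for arg in t[2]]
    return funcs[t[1]](*args)

# The ring Z/5 as a structure for the language {+, *, 0, 1}.
funcs = {"+": lambda a, b: (a + b) % 5, "*": lambda a, b: (a * b) % 5}
consts = {"0": 0, "1": 1}

# The term x*x + (1+1)*x + 1, i.e. the polynomial x^2 + 2x + 1.
x, one = ("var", "x"), ("const", "1")
two = ("app", "+", [one, one])
poly = ("app", "+", [("app", "+", [("app", "*", [x, x]),
                                   ("app", "*", [two, x])]), one])

value_at_3 = eval_term(poly, funcs, consts, {"x": 3})   # 16 mod 5
```

Adding further variables to the assignment does not change the result, in line with the remark that the value depends only on the variables appearing in t.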

In the following definition we use s(a/v) for the assignment s′ which agrees
with s except that s′(v) = a.


3.8. DEFINITION. Let 𝔐 be an L-structure. We define a relation

    𝔐 ⊨ φ[s],

(read: the assignment s satisfies the formula φ in 𝔐) for all assignments s
and all formulas φ as follows.
(i) 𝔐 ⊨ (t_1 = t_2)[s] iff t_1^𝔐(s) = t_2^𝔐(s),
(ii) 𝔐 ⊨ R(t_1 ⋯ t_n)[s] iff (t_1^𝔐(s), ..., t_n^𝔐(s)) ∈ R^𝔐,
(iii) 𝔐 ⊨ ¬φ[s] iff not 𝔐 ⊨ φ[s],
(iv) 𝔐 ⊨ (φ ∧ ψ)[s] iff 𝔐 ⊨ φ[s] and 𝔐 ⊨ ψ[s],
(v) 𝔐 ⊨ (φ ∨ ψ)[s] iff 𝔐 ⊨ φ[s] or 𝔐 ⊨ ψ[s] or both,
(vi) 𝔐 ⊨ (φ → ψ)[s] iff either not 𝔐 ⊨ φ[s] or else 𝔐 ⊨ ψ[s],
(vii) 𝔐 ⊨ (∃vφ)[s] iff there is an a ∈ M such that 𝔐 ⊨ φ[s(a/v)],
(viii) 𝔐 ⊨ (∀vφ)[s] iff for all a ∈ M, 𝔐 ⊨ φ[s(a/v)].
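
For a finite structure the clauses of 3.8 can be executed literally. The sketch below assumes a tuple encoding of formulas and a dict layout for structures (keys "M", "funcs", "consts", "rels"); these, and the names eval_term and satisfies, are illustrative assumptions. The quantifier clauses loop over M, so this only terminates when M is finite.

```python
def eval_term(t, struct, s):
    """Value of a term: ("var", v), ("const", c) or ("app", f, [args])."""
    if t[0] == "var":
        return s[t[1]]
    if t[0] == "const":
        return struct["consts"][t[1]]
    args = [eval_term(arg, struct, s) for arg in t[2]]
    return struct["funcs"][t[1]](*args)

def satisfies(struct, phi, s):
    """True iff the assignment s satisfies phi in struct (clauses (i)-(viii))."""
    kind = phi[0]
    if kind == "eq":
        return eval_term(phi[1], struct, s) == eval_term(phi[2], struct, s)
    if kind == "rel":
        args = tuple(eval_term(arg, struct, s) for arg in phi[2])
        return args in struct["rels"][phi[1]]
    if kind == "not":
        return not satisfies(struct, phi[1], s)
    if kind == "and":
        return satisfies(struct, phi[1], s) and satisfies(struct, phi[2], s)
    if kind == "or":
        return satisfies(struct, phi[1], s) or satisfies(struct, phi[2], s)
    if kind == "imp":
        return (not satisfies(struct, phi[1], s)) or satisfies(struct, phi[2], s)
    if kind == "exists":                  # s(a/v) is the dict {**s, v: a}
        return any(satisfies(struct, phi[2], {**s, phi[1]: a}) for a in struct["M"])
    if kind == "forall":
        return all(satisfies(struct, phi[2], {**s, phi[1]: a}) for a in struct["M"])
    raise ValueError(f"unknown formula kind: {kind}")

# The group Z/3 as a structure for L = {+, 0}.
Z3 = {"M": [0, 1, 2], "funcs": {"+": lambda a, b: (a + b) % 3},
      "consts": {"0": 0}, "rels": {}}
x, y, zero = ("var", "x"), ("var", "y"), ("const", "0")
has_inverse = ("exists", "y", ("eq", ("app", "+", [x, y]), zero))
inverses = ("forall", "x", has_inverse)                       # a sentence
exponent_two = ("forall", "x", ("eq", ("app", "+", [x, x]), zero))
```

Here inverses holds in Z3 under any assignment, while exponent_two fails because 1 + 1 = 2 in Z/3.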

There is nothing surprising here. It is just making sure that each of our 
symbols means what we want it to mean. There is one possibly confusing 
point, in (i), caused by our using = for both the real equality (on the 
right-hand side) and the symbol for equality (on the left). Many authors 
abhor this confusion of use and mention and use something like ≡ or ≈
for the symbol.

The reader should observe that the truth or falsity of 𝔐 ⊨ φ[s] depends
only on the values of s(v) for variables v which are actually free in φ. That
is, if s_1(v) = s_2(v) for all v free in φ, then 𝔐 ⊨ φ[s_1] iff 𝔐 ⊨ φ[s_2]. Thus, if φ
is φ(v_1 ⋯ v_n) and a_1 = s(v_1), ..., a_n = s(v_n), then we may write
𝔐 ⊨ φ[a_1, ..., a_n] for 𝔐 ⊨ φ[s] without confusion. Also, if φ is a sentence,
then the truth or falsity of 𝔐 ⊨ φ[s] is completely independent of s. Thus
we may write 𝔐 ⊨ φ (read: 𝔐 is a model of φ, or 𝔐 satisfies φ) if for some
(hence every) assignment s, 𝔐 ⊨ φ[s].

If φ(v) is a formula and t is a term then φ(t/v) (or, more simply, φ(t))
denotes the result of replacing all occurrences of the free variable v by the
term t throughout. When using this notation we always assume that none
of the variables in t occur as bound variables in φ. If they did we could
always rename the bound variables. Otherwise we would distort the
meaning of φ(t). For example, if t is w and φ(v) is ∃w (v ≠ w), then φ(t)
should assert ∃w′ (w ≠ w′), not ∃w (w ≠ w).

A structure 𝔐 is a model of a set Φ of sentences if 𝔐 ⊨ φ for all φ ∈ Φ.
Given two structures 𝔐, 𝔑 for L, we say that 𝔐 and 𝔑 are elementarily
equivalent, and write 𝔐 ≡ 𝔑, iff for all sentences φ of L, 𝔐 ⊨ φ iff 𝔑 ⊨ φ. If
𝔐 ≅ 𝔑 (i.e. 𝔐 is isomorphic to 𝔑, in the obvious sense) then 𝔐 ≡ 𝔑.
Finally, let K be a class of structures for a language L. K is (finitely)
axiomatizable if there is a (finite) set Φ of first-order sentences of L such
that, for all structures 𝔐, 𝔐 ∈ K iff 𝔐 is a model of Φ. This agrees with our
terminology in Section 2. Some authors call a finitely axiomatizable class an
elementary class, or EC. They are then forced into calling an axiomatizable
class elementary in the wider sense, or EC_Δ.


4. The Completeness Theorem 


Surely the most important discovery for mathematics by the ancient
Greeks was of the notion of proof, turning mathematics into a deductive
science. Each theorem φ must have a proof from a set T of more or less
explicitly stated assumptions, or axioms. The proof must demonstrate that
the conclusion φ follows from the axioms in T by the laws of logic alone.
The mathematician implicitly assumes that he understands the notion of
proof and that, in particular, he will be able to check in a rigorous manner
whether a purported complete proof does indeed establish the conclusion
from the stated assumptions. The natural question is: Can the notions
"laws of logic" and "proof" be made mathematically precise?

In this section we want to show that there is a mathematically precise
notion of "φ is provable from T" which captures completely the intuitive
notion "φ follows from T by the laws of logic alone", for first-order φ and
T. More fully, we want to provide a concrete set of obviously valid rules of
inference such that φ follows from T by the laws of logic alone if and only
if there is a proof of φ from axioms in T which uses only the permitted
rules of inference.

There is a seeming obstacle to our program. How can we hope to prove 
such a result without knowing in advance what it means to follow by the 
laws of logic alone? Luckily, we do not need to know. All we need is to 
agree that, whatever it means, it at least implies that φ will hold in all set
theoretic structures which are models of T; i.e., it implies T ⊨ φ. Thus, to
realize our goal, it more than suffices to provide valid rules of inference and
show that T ⊨ φ if and only if φ is provable from T. This is the content of
Gödel's Completeness Theorem.

The plan of this section is as follows. In 4.1 and 4.2 we take care of 
so-called propositional logic. In 4.3-4.8 we discuss a method, due in 
essence to Henkin, for reducing certain problems of first-order logic back 
to problems about propositional logic. The proofs of the Compactness
Theorem and the Löwenheim-Skolem Theorem fall out of this method.
Finally we present two different versions of the Gödel Completeness
Theorem which are consequences of 4.8, a Hilbert-style formal system
(4.9) and a Gentzen-style formal system (4.13).


Propositional logic 


It is expeditious to break the study of first-order logic up into two parts, 
the trivial part having to do with the propositional connectives ∧, ∨, ¬, →,
and then the part having to do with equality and the quantifiers ∀ and ∃.

Let P be a set of objects called prime formulas. They might be sentences 
of some natural language or letters p, q, r, ... of the alphabet, for example.
In our application, they will be those first-order formulas which are not
propositional combinations of simpler formulas, that is, atomic formulas
and formulas beginning with a quantifier. The set of propositional formulas
of P forms the smallest set of expressions containing the members of P and
closed under the rule: if A, B are propositional formulas then so are ¬A,
(A ∧ B), (A ∨ B) and (A → B). The prime constituents of a propositional
formula A are just the prime formulas out of which A is built. 


Examples. Suppose P = {p, q, r}. The following are propositional formulas 
of P: 
    p,    ¬q,    (p ∨ q),    (q ∨ p),    ((p ∨ q) → (q ∨ p)).

We want to show exactly how the truth or falsity of a propositional
formula depends on the truth or falsity of its prime constituents. Then,
going a step further, we show how to decide which propositional formulas
are always true, regardless of the truth or falsity of their prime constituents,
formulas like (p ∨ ¬p), (((p → q) ∧ ¬q) → ¬p), etc. Such formulas are
called propositional tautologies, since they are true by virtue of their 
syntactic form alone. These tautologies provide a small first step in 
isolating the laws of logic. 

Let t and f be distinct new symbols, thought of as "true" and "false". A
truth assignment for a set P of prime formulas is, by definition, a function
v : P → {t, f}. For each truth assignment v we define its extension v̄ to the
set of all propositional formulas of P by induction on length of formulas as
follows:

    v̄(A) = v(A)      if A is prime;
    v̄(¬A) = f        if v̄(A) = t,
          = t        if v̄(A) = f;
    v̄(A ∧ B) = t     if v̄(A) = v̄(B) = t,
             = f     otherwise;
    v̄(A ∨ B) = t     if v̄(A) = t or v̄(B) = t or both,
             = f     otherwise;
    v̄(A → B) = f     if v̄(A) = t and v̄(B) = f,
             = t     otherwise.
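
The inductive definition of the extension can be sketched as a short recursive function. The encoding is an assumption for illustration: prime formulas are Python strings, compound formulas are tuples such as ("and", A, B), and True and False stand for t and f. The name extend is hypothetical.

```python
def extend(v, A):
    """Value of the propositional formula A under the truth assignment v.

    v is a dict from prime formulas (strings) to True/False.
    """
    if isinstance(A, str):                # A is prime
        return v[A]
    op = A[0]
    if op == "not":
        return not extend(v, A[1])
    if op == "and":
        return extend(v, A[1]) and extend(v, A[2])
    if op == "or":
        return extend(v, A[1]) or extend(v, A[2])
    if op == "imp":                       # false only when A[1] is t and A[2] is f
        return (not extend(v, A[1])) or extend(v, A[2])
    raise ValueError(f"unknown connective: {op}")

v = {"p": True, "q": False}
commute = ("imp", ("or", "p", "q"), ("or", "q", "p"))
```

With this v, the formula (p → q) receives False and commute receives True.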


This definition can be summarized by means of the following truth table:

    A   B  |  ¬A   (A ∧ B)   (A ∨ B)   (A → B)
    t   t  |  f       t         t         t
    t   f  |  f       f         t         f
    f   t  |  t       f         t         t
    f   f  |  t       f         f         t


By constructing such truth tables we can completely analyze how the truth 
or falsity of a propositional formula depends on the truth or falsity of its 
prime constituents. We illustrate the method for the formula 


((7@474)4q)> P). 
We simplify the table by leaving out some of the t’s. 
Thus, the only circumstance under which our final formula is false is when
p is false and q is true.


4.1. DEFINITION. A propositional formula A of P is a tautology if v̄(A) = t
for all truth assignments v : P → {t, f}. A is consistent if v̄(A) = t for some
v : P → {t, f}.


The method of truth tables makes it a trivial matter to see whether a 
propositional formula is a tautology or not, or whether it is consistent or 
not. If we write A ↔ B for ((A → B) ∧ (B → A)) then we see that the
following are tautologies:

    (A ∨ ¬A)                   (law of the excluded middle),
    ¬(A ∧ ¬A)                  (law of contradiction),
    ¬(A ∧ B) ↔ (¬A ∨ ¬B)
    ¬(A ∨ B) ↔ (¬A ∧ ¬B)       (de Morgan's laws),
    ¬¬A ↔ A                    (law of double negation).


Just to make sure the method of truth tables is perfectly clear, we present 
an example with three prime constituents p, q,r: 
    [((p ∧ q) → r) ∧ (¬r → q)] → (p → r)

where A denotes ((p ∧ q) → r), B denotes (¬r → q) and C denotes (p → r):

    p  q  r  |  p ∧ q   A   ¬r   B   A ∧ B   C   [A ∧ B] → C
    t  t  t  |    t     t   f    t     t     t        t
    t  t  f  |    t     f   t    t     f     f        t
    t  f  t  |    f     t   f    t     t     t        t
    t  f  f  |    f     t   t    f     f     f        t
    f  t  t  |    f     t   f    t     t     t        t
    f  t  f  |    f     t   t    t     t     t        t
    f  f  t  |    f     t   f    t     t     t        t
    f  f  f  |    f     t   t    f     f     t        t

Thus, since no falses turn up in the last column, the formula is indeed a
tautology.

In practice, there is a much shorter method to check to see whether a
formula is or is not a tautology. One works backwards, trying to find a
consistent assignment which makes the formula false. Applied to the
above, to make [A ∧ B] → C false, we need to have v̄(A) = v̄(B) = t but
v̄(C) = f. To make v̄(C) = f, we must make v(p) = t, v(r) = f. To have
v̄(B) = t we must have v(q) = t, since v̄(¬r) = t. But now we have
v(p) = v(q) = t and v(r) = f which gives v̄(A) = f, a contradiction. Thus
the above formula is a tautology.
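
The truth-table test of Definition 4.1 is easy to mechanize by enumerating all assignments to the prime constituents. The sketch below uses that brute-force method (not the backwards search just described); the formula encoding (strings for primes, tuples for compounds) and all function names are assumptions for illustration.

```python
from itertools import product

def value(v, A):
    """Extension of the truth assignment v (a dict) to the formula A."""
    if isinstance(A, str):
        return v[A]
    op = A[0]
    if op == "not":
        return not value(v, A[1])
    if op == "and":
        return value(v, A[1]) and value(v, A[2])
    if op == "or":
        return value(v, A[1]) or value(v, A[2])
    if op == "imp":
        return (not value(v, A[1])) or value(v, A[2])
    raise ValueError(f"unknown connective: {op}")

def prime_constituents(A):
    """Set of prime formulas out of which A is built."""
    if isinstance(A, str):
        return {A}
    result = set()
    for sub in A[1:]:
        result |= prime_constituents(sub)
    return result

def falsifying_assignments(A):
    """All rows of the truth table of A whose final column is f."""
    letters = sorted(prime_constituents(A))
    rows = []
    for bits in product([True, False], repeat=len(letters)):
        v = dict(zip(letters, bits))
        if not value(v, A):
            rows.append(v)
    return rows

def is_tautology(A):
    return not falsifying_assignments(A)

def is_consistent(A):
    # A is consistent iff its negation is not a tautology.
    return not is_tautology(("not", A))

# The three-constituent example: [((p and q) -> r) and (not r -> q)] -> (p -> r).
A_ = ("imp", ("and", "p", "q"), "r")
B_ = ("imp", ("not", "r"), "q")
C_ = ("imp", "p", "r")
example = ("imp", ("and", A_, B_), C_)
```

The table has 2^n rows for n prime constituents, so this check is practical only for small formulas; the backwards method in the text usually avoids most of that work.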

A set T of propositional formulas is said to be consistent (in the sense of 
propositional logic) if there is a truth assignment v such that v̄(A) = t for
all A € T. 


4.2. COMPACTNESS THEOREM FOR PROPOSITIONAL LOGIC. A set T of propositional
formulas is consistent if and only if every finite subset of T is consistent.


PROOF. We present two proofs of the nontrivial half.

First proof. For the purposes of this proof call a set S finitely consistent if 
every finite subset of S is consistent. We wish to prove that every finitely 
consistent set is consistent. Call S maximal finitely consistent if S is finitely 
consistent and for every formula A, either A ∈ S or (¬A) ∈ S.

There is a natural correspondence between valuations v and maximal,
finitely consistent sets. To any v assign the set S_v = {A | v̄(A) = t}. This set
is maximal, finitely consistent. Conversely, given a maximal, finitely
consistent set S, define v(p) = t if p ∈ S, v(p) = f if p ∉ S. The following
facts follow immediately from the fact that S is maximal, finitely consistent,
and imply (by induction on formulas A) that S = S_v:

    B ∉ S         iff   (¬B) ∈ S,
    (A ∧ B) ∈ S   iff   A ∈ S and B ∈ S,
    (A ∨ B) ∈ S   iff   A ∈ S or B ∈ S,
    (A → B) ∈ S   iff   A ∉ S or B ∈ S.


For example, let's prove that (A ∨ B) ∈ S implies A ∈ S or B ∈ S.
Suppose not. Then (A ∨ B) ∈ S but (¬A) ∈ S and (¬B) ∈ S, by maximality.
But then {(A ∨ B), ¬A, ¬B} is a finite, inconsistent subset of S, a
contradiction.

The above remarks show that proving a finitely consistent set T 
consistent is equivalent to finding a maximal, finitely consistent set S ⊇ T.
We show how to construct such an S in the case where the underlying set P
of prime formulas is countable. Essentially the same proof works as long as 
P is well-ordered and hence, by the axiom of choice, works for all P. The 
proof for well-ordered P does not need the axiom of choice. 

If P is countable, so is the set of all formulas so we enumerate them:
A_0, A_1, ..., A_n, .... Define T_0 ⊆ T_1 ⊆ ⋯ ⊆ T_n ⊆ ⋯ by

    T_0 = T,
    T_{n+1} = T_n ∪ {A_n}      if this is finitely consistent,
            = T_n ∪ {¬A_n}     otherwise.


Let S = ⋃_n T_n. Clearly T ⊆ S and for every A, either A ∈ S or (¬A) ∈ S.
To finish the proof we need only show that each T_n, and hence S, is finitely
consistent. This is proved by induction on n with n = 0 being the
hypothesis of the theorem that T is finitely consistent. Assume T_n is finitely
consistent, and prove that T_{n+1} is finitely consistent.

Case 1. T_{n+1} = T_n ∪ {A_n}. By the definition of T_{n+1}, this is finitely
consistent.

Case 2. T_{n+1} = T_n ∪ {¬A_n}. This can only happen if there is some finite
set T′ ⊆ T_n such that T′ ∪ {A_n} is not consistent. Suppose that T_{n+1} is not
finitely consistent. Then there is a finite set T″ ⊆ T_n such that T″ ∪ {¬A_n}
is not consistent. But then T′ ∪ T″ is a finite subset of T_n so is consistent.
Any assignment v making all of T′ ∪ T″ true must make one of A_n or
¬A_n true, contradicting the inconsistency of both T′ ∪ {A_n} and T″ ∪
{¬A_n}.

Thus, in either case, T_{n+1} is after all finitely consistent. This finishes the
proof. 
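
The step-by-step extension in the first proof can be imitated on a toy scale. The sketch below assumes finitely many prime formulas and a finite list standing in for the enumeration, so "finitely consistent" can be tested by brute force; in the theorem itself the enumeration is infinite and no such test is claimed. The encoding and the names value, consistent and extend_to_maximal are hypothetical.

```python
from itertools import product

def value(v, A):
    """Extension of the truth assignment v to the formula A (strings are prime)."""
    if isinstance(A, str):
        return v[A]
    op = A[0]
    if op == "not":
        return not value(v, A[1])
    if op == "and":
        return value(v, A[1]) and value(v, A[2])
    if op == "or":
        return value(v, A[1]) or value(v, A[2])
    if op == "imp":
        return (not value(v, A[1])) or value(v, A[2])
    raise ValueError(f"unknown connective: {op}")

def consistent(formulas, letters):
    """True iff some truth assignment to `letters` makes every formula true."""
    for bits in product([True, False], repeat=len(letters)):
        v = dict(zip(letters, bits))
        if all(value(v, A) for A in formulas):
            return True
    return False

def extend_to_maximal(T, enumeration, letters):
    """Add A_n if the result stays consistent, otherwise add its negation."""
    S = list(T)
    for A in enumeration:
        if consistent(S + [A], letters):
            S.append(A)
        else:
            S.append(("not", A))
    return S

letters = ["p", "q"]
T = [("or", "p", "q"), ("not", "p")]
enumeration = ["p", "q", ("and", "p", "q")]
S = extend_to_maximal(T, enumeration, letters)

# Read off the valuation as in the proof: v(p) = t iff p is in S.
v = {p: (p in S) for p in letters}
```

In this run S decides every enumerated formula, and the valuation read off from S makes all of S, and hence all of T, true.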

Second proof. We can give a faster proof by quoting the Tychonoff 
Theorem. It hides the basic construction, though, and thus is less suitable 
for other constructions in model theory. Let 2 = {t, f} be the two-element
space with the discrete topology and let X = 2^P, the space of all truth
assignments of P with the product topology. By the Tychonoff Theorem
X is a compact, Hausdorff space. Hence if ℱ = {F_i | i ∈ I} is an indexed
family of closed subsets, and if ⋂_{i∈I} F_i = ∅, then there is a finite I_0 ⊆ I
such that ⋂_{i∈I_0} F_i = ∅. For each propositional formula A, let F_A = {v ∈ X |
v̄(A) = t}. We claim that each F_A is clopen (both closed and open) in X.
For A = p a prime formula F_p is open by the very definition of the product
topology. But X − F_p = {v | v(p) = f} is also open, by definition, so F_p is
clopen. For more complicated formulas, the claim follows by induction on
length of formulas and the following equations:

    F_{(A ∨ B)} = F_A ∪ F_B,         F_{(A ∧ B)} = F_A ∩ F_B,
    F_{(A → B)} = F_{¬A} ∪ F_B,      F_{¬A} = X − F_A.


This establishes the claim. Now let T be as given in the theorem. By
hypothesis, for each finite T_0 ⊆ T, there is a v making all A ∈ T_0 true, i.e.
⋂_{A∈T_0} F_A ≠ ∅. By the compactness of X, ⋂_{A∈T} F_A ≠ ∅. Thus, there is a
truth assignment v making all A ∈ T true. □


The standard classroom example of a simple application of the Compact- 
ness Theorem for Propositional Logic is to prove that if an infinite map 
cannot be colored by k colors then some finite submap cannot be colored 
by k colors. To prove this one assigns k prime formulas to each "country"
on the map, one for each color, and writes down the obvious "axioms"
asserting that each country gets exactly one color "true" and that adjacent
countries do not have the same color "true". Another example, if one gives
the first proof of the Compactness Theorem, is to prove the Tychonoff
Theorem for 2^P.
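
To make the "obvious axioms" of the coloring example concrete, here is a sketch for a finite map. It is only an illustration: the prime formulas are pairs (country, color), each axiom is written as a clause (a disjunction of primes and negated primes), and consistency is tested by brute force. The names coloring_axioms and consistent are hypothetical, and the compactness argument itself, which concerns infinite maps, is not reproduced by the code.

```python
from itertools import product

def coloring_axioms(countries, borders, k):
    """Clauses saying: each country has exactly one color, neighbors differ.

    A clause is a list of (prime, wanted) pairs and is true under v
    iff v[prime] == wanted for at least one pair.
    """
    clauses = []
    for c in countries:
        # at least one color is "true" for c
        clauses.append([((c, i), True) for i in range(k)])
        # at most one color is "true" for c
        for i in range(k):
            for j in range(i + 1, k):
                clauses.append([((c, i), False), ((c, j), False)])
    for a, b in borders:
        # adjacent countries do not share a color
        for i in range(k):
            clauses.append([((a, i), False), ((b, i), False)])
    return clauses

def consistent(clauses):
    """True iff some truth assignment makes every clause true."""
    primes = sorted({p for clause in clauses for p, _ in clause})
    for bits in product([True, False], repeat=len(primes)):
        v = dict(zip(primes, bits))
        if all(any(v[p] == wanted for p, wanted in clause) for clause in clauses):
            return True
    return False

# A map of three mutually adjacent countries.
countries = ["a", "b", "c"]
borders = [("a", "b"), ("b", "c"), ("a", "c")]
two_colorable = consistent(coloring_axioms(countries, borders, 2))
three_colorable = consistent(coloring_axioms(countries, borders, 3))
```

A truth assignment satisfying all the clauses is exactly a coloring, so the axiom set is consistent precisely when the map can be colored with k colors.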


The use of Henkin constants for reducing first-order logic to 
propositional logic 


In this subsection we apply the notions of propositional logic to 
first-order logic. Given a language L let P be the set of formulas of L which 
are atomic or begin with ∀ or ∃. Thus, a tautology of first-order logic is any
formula which is true regardless of what truth assignment is given to the
prime formulas. For example the following are tautologies,


    ∀x R(x) ∨ ¬∀x R(x),
    ¬(∀x R(x) ∧ ∃x S(x)) ↔ (¬∀x R(x) ∨ ¬∃x S(x)),


but the following sentences are not tautologies: 


    (c = c),
    ∀x (R(x) ∨ ¬R(x)),
    ¬∃x S(x) → ∀x ¬S(x).

The first two are prime formulas, the third has the form ¬p → q for prime
p and q. We see that the tautologies of first-order logic barely scratch the
surface of the collection of "laws of logic".

We state an obvious lemma for the record.
4.3. LEMMA. Let 𝔐 = (M, ...) be a structure for a language L and let s be
an assignment in 𝔐, i.e., a function mapping the variables of L into M.
There is a truth assignment v to the prime formulas of L such that, for all
formulas φ of L, 𝔐 ⊨ φ[s] if and only if v̄(φ) = t. In particular, any set of
sentences true in 𝔐 is consistent in the sense of propositional logic.


PROOF. For φ a prime formula, define v(φ) = t if 𝔐 ⊨ φ[s], otherwise
v(φ) = f. Since every formula is built up from prime formulas by means of
propositional connectives, the conclusion is obvious. □


The converse of the lemma is far from true. For example the following 
set of sentences is consistent in the sense of propositional logic (they are all 
prime formulas) but has no model: 


    {∀x (R(x) → S(x)), ∀x R(x), ∃x ¬S(x)}.


There has been no analysis of the quantificational structure of the
sentences.


4.4. EQUALITY AXIOMS. The equality axioms are as follows, where
u, w, u_1, ... denote variables and constant symbols of L:

    (u = u),
    (u = w) → (w = u),
    (u_1 = u_2 ∧ u_2 = u_3) → (u_1 = u_3),
    (u_1 = w_1 ∧ ⋯ ∧ u_n = w_n) → (R(u_1 ⋯ u_n) → R(w_1 ⋯ w_n)),
    (u_1 = w_1 ∧ ⋯ ∧ u_n = w_n) → (t(u_1 ⋯ u_n) = t(w_1 ⋯ w_n)),

where R is an arbitrary n-ary relation symbol of L and t is an arbitrary
n-ary term of L. The equality axioms are valid in that for all such axioms
φ, all 𝔐 and all assignments s to variables, 𝔐 ⊨ φ[s]. The last four axioms
for equality might well be called Leibniz Law.


The witnessing expansion L(C) of a language L is constructed as follows.
Let C_0 = ∅ and, once C_n is defined, let L_n = L ∪ C_n. For each formula φ(v)
of L_0 with exactly one free variable let c_{φ(v)} be a distinct new constant
symbol and let C_1 be the set of all these c_{φ(v)}. Given C_n, assign distinct new
constant symbols c_{φ(v)} to each formula φ(v) of L_n which is not already a
formula of L_{n−1} (i.e., if some constant from C_n appears in φ). Let C_{n+1} be
C_n union the set of all these new c_{φ(v)}. Let C = ⋃_n C_n and let L(C) =
L ∪ C.
4.5. DEFINITION. The constant symbol c_{φ(v)} is called a witnessing constant
and the sentences

    I     (∃v φ(v)) → φ(c_{φ(v)}),
    II    φ(c_{¬φ(v)}) → ∀v φ(v)

are called Henkin axioms of types I and II.


The informal idea behind the Henkin axioms is quite simple. If ∃v φ(v)
is true in a structure, choose an element a satisfying φ(v) and give it a new
name c_{φ(v)}. If ∀v φ(v) is false, choose a counterexample b and call it by the
new name c_{¬φ(v)}.


4.6. DEFINITION. T_Henkin is, by definition, the set of all sentences of L(C)
which are either Henkin axioms or else of one of the forms:

    III   ∀v φ(v) → φ(t),    t a closed term of L(C);
    IV    φ(t) → ∃v φ(v),    t a closed term of L(C).

These latter are called the quantifier axioms. Their informal content is
clear.


The set T_Henkin is not true in every L(C)-structure, but the next lemma
shows that every L-structure can be turned into an L(C)-structure which is
a model of T_Henkin, using the idea discussed following 4.5.

If L ⊆ L′ are languages and 𝔐′ = (M, F′) is a structure for L′ then
𝔐 = (M, F′ ↾ L) is called the reduct of 𝔐′ to L and 𝔐′ is called an expansion
of 𝔐 to the language L′. Thus, 𝔐 and 𝔐′ are the same except that 𝔐′
assigns meanings to the symbols in L′ − L.


4.7. LEMMA. Let 𝔐 be any structure for a language L and let L(C) be the
witnessing expansion of L. There is an assignment of elements of 𝔐 to the
constant symbols of C so that the resulting expansion of 𝔐 is a model of
T_Henkin.


PROOF. The quantifier axioms III, IV are going to be true regardless, so we
only need worry about those of types I, II. Suppose we contrive to make
those of type I true, and consider a typical one of type II:

    φ(c_{¬φ(v)}) → ∀v φ(v).

Suppose the hypothesis is true in the expansion, but not the conclusion.
But then ∃v ¬φ(v) is true, so by I, ¬φ(c_{¬φ(v)}) is true, a contradiction.
Thus, if all axioms of type I are true, so are all of type II.

We proceed to assign elements to the constants in C_n by induction on n.
If c_{φ(v)} ∈ C_1 then ∃v φ(v) is a sentence of L and hence makes sense in 𝔐. If
𝔐 ⊨ ∃v φ(v), choose some a ∈ M so that 𝔐 ⊨ φ[a] and set c_{φ(v)}^𝔐 = a. If
𝔐 ⊨ ¬∃v φ(v), define c_{φ(v)}^𝔐 arbitrarily. This makes all the Henkin
axioms of type I about the c_{φ(v)} ∈ C_1 true. But once the constants of C_1 are
interpreted, all the sentences of L_1 = L ∪ C_1 make sense, so we can carry
out the same argument and assign elements to the c_{φ(v)} ∈ C_2, and so on. □


A canonical structure for L(C) is a structure 𝔐 = (M, ...) such that every
a ∈ M is denoted by some c ∈ C. That is, M = {c^𝔐 | c ∈ C}. The set Eq
mentioned in 4.8 is the set of equality axioms of L(C) which are sentences 
of L(C); i.e., those which contain no variables. 

The following lemma may seem rather technical but the equivalence of 
(i) and (iii) shows that we have reduced problems about models of first-order
theories to essentially trivial questions about propositional logic. There is a 
price to be paid, however. Even in the case where the T in (i) is finite, the 
propositional theory in (iii) is infinite. 


4.8. MAIN LEMMA (The reduction to propositional logic). Let L be a
first-order language and let L(C) be the witnessing expansion of L. For any
set T of sentences of L the following conditions are equivalent:

(i) T has a model; i.e. there is an L-structure 𝔐 which is a model of all
sentences in T.

(ii) There is a canonical L(C)-structure 𝔐 which is a model of all
sentences in T.

(iii) T ∪ T_Henkin ∪ Eq is consistent in the sense of propositional logic.


PROOF. The implication (ii) ⇒ (i) is immediate, while (i) ⇒ (iii) follows
from Lemma 4.7 together with Lemma 4.3. We prove (iii) ⇒ (ii). Let v be a
truth assignment to the prime sentences of L(C) such that v̄(φ) = t for all
φ ∈ T ∪ T_Henkin ∪ Eq. To prove the lemma, we construct a canonical model
𝔐 = (M, ...) such that, for all sentences φ of L(C),

    𝔐 ⊨ φ iff v̄(φ) = t.

The main function of T_Henkin is to guarantee that v̄ satisfies the following
conditions:

    v̄(∃v φ(v)) = t iff v̄(φ(c_{φ(v)})) = t,
    v̄(∀v φ(v)) = t iff v̄(φ(t)) = t for all closed terms t of L(C).

The conditions allow us to construct our model 𝔐 out of the constants in C
in a way that is analogous to the construction of a group from generators
(the elements of C) and defining relations (the axioms of T). To define 𝔐
we must (a) specify the universe M of 𝔐, (b) define, for each n-ary relation
symbol R ∈ L an n-ary relation R^𝔐 to interpret R, (c) define for each
n-ary function symbol f ∈ L an interpretation f^𝔐 : M^n → M, and (d) define
for each constant symbol c of L ∪ C an element c^𝔐 ∈ M. Having thus
constructed 𝔐 it will remain only to verify that 𝔐 ⊨ φ iff v̄(φ) = t, for all
sentences φ of L(C). This condition tells us how we must fulfill conditions
(b)-(d) above.
(a) Define an equivalence relation ~ on C by

    c ~ d iff v((c = d)) = t.

The equality axioms guarantee that ~ is an equivalence relation on C.
Suppose, for example, that c ~ d and d ~ e. We check to see that c ~ e.
We have v(c = d) = t and v(d = e) = t by c ~ d and d ~ e, and
v̄(((c = d ∧ d = e) → c = e)) = t since this sentence is an equality axiom.
Hence v(c = e) = t, so c ~ e. Let c̄ be the equivalence class of c and let
M = {c̄ | c ∈ C}.
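
Step (a) amounts to partitioning the constants by the equalities that v makes true. The sketch below illustrates this for a finite list of constants; the function name equivalence_classes and the predicate v_eq (standing in for "v assigns t to the prime sentence (c = d)") are assumptions, and the code relies on v_eq being an equivalence relation, which is what the equality axioms guarantee.

```python
def equivalence_classes(constants, v_eq):
    """Partition `constants` into classes of the relation c ~ d iff v_eq(c, d).

    Assumes v_eq is reflexive, symmetric and transitive, so comparing with
    one representative of each class is enough.
    """
    classes = []
    for c in constants:
        for cls in classes:
            if v_eq(c, cls[0]):
                cls.append(c)
                break
        else:
            classes.append([c])           # c starts a new class
    return classes

# A hypothetical v that makes (c0 = c2) and (c1 = c3) true, and no other
# equalities between distinct constants.
label = {"c0": 0, "c1": 1, "c2": 0, "c3": 1}
v_eq = lambda c, d: label[c] == label[d]

M = equivalence_classes(["c0", "c1", "c2", "c3"], v_eq)
```

Each inner list plays the role of one element of the universe M, so in this toy case the canonical structure would have two elements.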

(b) Define R^𝔐 by

    (c̄_1, ..., c̄_n) ∈ R^𝔐 iff v(R(c_1, ..., c_n)) = t.

To see that this is well defined we must check that if

    c̄_1 = d̄_1, ..., c̄_n = d̄_n and (c̄_1, ..., c̄_n) ∈ R^𝔐,

then (d̄_1, ..., d̄_n) ∈ R^𝔐. This is a consequence of the fact that

    c_1 = d_1 ∧ ⋯ ∧ c_n = d_n ∧ R(c_1, ..., c_n) → R(d_1, ..., d_n)

is an equality axiom and hence is assigned true by v̄.

(c) Let c_1, ..., c_n ∈ C and f ∈ L be given. We claim there is a c ∈ C such
that v(f(c_1 ⋯ c_n) = c) = t. For consider the formula φ(x) given by
(f(c_1 ⋯ c_n) = x). If v̄(∃v φ(v)) = t, then v(f(c_1 ⋯ c_n) = c_φ) = t. So suppose
that v̄(∃v φ(v)) = f. But one member of T_Henkin is the sentence
(φ(f(c_1 ⋯ c_n)) → ∃v φ(v)) so that v̄(φ(f(c_1 ⋯ c_n))) = f. But this says that v
assigns f to the atomic sentence (f(c_1 ⋯ c_n) = f(c_1 ⋯ c_n)). But v(c_i = c_i)
= t (i = 1, ..., n) and v̄((c_1 = c_1 ∧ ⋯ ∧ c_n = c_n) → (f(c_1 ⋯ c_n) = f(c_1 ⋯ c_n)))
= t since these are equality axioms, which is a contradiction. Thus
v̄(∃v φ(v)) = t after all. We can define f^𝔐(c̄_1, ..., c̄_n) = c̄ for that c ∈ C
such that v(f(c_1 ⋯ c_n) = c) = t. An argument like that used in (b) shows
that f^𝔐 is well defined.

(d) If c ∈ C let c^𝔐 = c̄. If d ∈ L, then an argument similar to that used in
(c) shows that there is a c ∈ C such that v(d = c) = t so let d^𝔐 = c̄ for
this c.

This completes the construction of 𝔐 and guarantees that for atomic
sentences 𝔐 ⊨ φ iff v(φ) = t. To prove this for other sentences we proceed
by induction on length of formulas. The propositional connectives are
trivial. For example, 𝔐 ⊨ (φ ∧ ψ) iff 𝔐 ⊨ φ and 𝔐 ⊨ ψ (by definition of ⊨)
iff v̄(φ) = v̄(ψ) = t (by the induction hypothesis) iff v̄(φ ∧ ψ) = t. Suppose φ
is ∃x ψ(x). If v̄(φ) = t then, by the condition above, there is a c such that
v̄(ψ(c)) = t so, by induction hypothesis, 𝔐 ⊨ ψ(c) so 𝔐 ⊨ ∃x ψ(x) so 𝔐 ⊨ φ.
On the other hand, if v̄(φ) = f then v̄(∃x ψ(x)) = f so by T_Henkin, v̄(ψ(t)) = f
for all closed terms t of L(C). In particular, for every c ∈ C, v̄(ψ(c)) = f.
By the induction hypothesis, 𝔐 ⊨ ¬ψ(c) for all c ∈ C. Since every element
of M is denoted by some c ∈ C, 𝔐 ⊨ ¬∃x ψ(x). Thus 𝔐 ⊨ ∃x ψ(x) iff
v̄(∃x ψ(x)) = t. The proof in the case when φ begins with ∀ is similar. □


The Main Lemma provides a method for actually constructing models of 
theories out of symbols. In particular, it gives us immediate proofs of the 
Compactness and Löwenheim-Skolem Theorems.


PROOF OF THE COMPACTNESS THEOREM (2.4). Let T be a set of sentences of 
the first-order language L such that every finite subset of T has a model. 
We need to show that T has a model. By (iii) ⇒ (i) of the Main Lemma this
amounts to proving that T ∪ T_Henkin ∪ Eq is consistent in the sense of
propositional logic. But, by the Compactness Theorem for Propositional
Logic, it suffices to prove that for every finite subset T_0 ⊆ T,
T_0 ∪ T_Henkin ∪ Eq is consistent, which follows from the hypothesis and
(i) ⇒ (iii) of the Main Lemma. □


Other proofs of this theorem appear in Chapters A.2 and A.3. It also 
follows directly from the Completeness Theorem below. 


PROOF OF THE LÖWENHEIM-SKOLEM THEOREM (2.5). Let κ be some infinite
cardinal and let L be a language with ≤ κ symbols. Since every formula of
L is a finite sequence of symbols, there are ≤ κ formulas of L. Recall the
definition of the witnessing expansion L(C) of L, where C = ⋃_n C_n.
Clearly, by induction, each C_n has cardinality ≤ κ so C has cardinality
≤ κ. Thus, any canonical structure for L(C) has ≤ κ elements, so the
desired result is an immediate consequence of (i) ⇒ (ii) in the Main
Lemma. □


The Completeness Theorem for a Hilbert-style formal system H


There are several quite distinct approaches to the Completeness 
Theorem, corresponding to different ways of thinking about proofs. Within 
each of the approaches there are endless variations in the exact formula- 
tion, corresponding to which laws of thought are taken as basic, which as 
derived. We will ignore the minor variations. The different basic ap- 
proaches are important, though, for different notions of proof lend 
themselves to different applications. The most important thing to re- 
member, however, is that while there are many notions of proof, there is 
only one real notion of provable for first-order logic, as the Completeness 
Theorem shows. 

The first type of formal system we discuss is a so-called Hilbert-style
formal system. It is usually the favorite of the mathematician because it is 
elegant and easy to remember. In 4.6 we sketch a Gentzen-type formal 
system. This type has proven very useful in analyzing the proof-theoretic 
strength of various mathematical theories. In a classroom situation, 
however, where students actually seem to enjoy working out formal proofs, 
the Fitch-type subordinate proof method or the Beth-type semantic 
tableaus are even better. The latter are discussed at length in SMULLYAN 
[1968]. 

In a Hilbert-style formal system, the emphasis is on logical axioms,
keeping the rules of inference at a minimum. If we had taken $\exists x$ as a
defined symbol, treating $\exists x$ as $\neg\forall x\, \neg$, the system would have been
superficially even simpler. It seems somehow more to the point, however,
to treat the laws of thought behind these quantifiers separately.

Let $L$ be a fixed first-order language. All formulas below are first-order
formulas of $L$, and all terms $t$ are terms of $L$. Recall our convention in
Section 3 about writing $\varphi(t/v)$, the result of replacing $v$ by $t$ in $\varphi$, only in
case the variables in $t$ do not occur bound in $\varphi$. We write $\varphi(t)$ for $\varphi(t/v)$
below.


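Axiom Schemata of H
(1) All tautologies,
(2) All equality axioms,
(3) All formulas of either of the forms

\[ \forall v\, \varphi(v) \to \varphi(t), \qquad \varphi(t) \to \exists v\, \varphi(v). \]


Rules of Inference of H
(1) (Modus Ponens) From $(\varphi \to \psi)$ and $\varphi$ infer $\psi$,
(2) (Generalization rules) If the variable $v$ is not free in $\varphi$, then:
from $\varphi \to \psi(v)$ infer $\varphi \to \forall y\, \psi(y)$,
from $\psi(v) \to \varphi$ infer $\exists y\, \psi(y) \to \varphi$.


These rules are usually written schematically as

\[ \frac{(\varphi \to \psi) \qquad \varphi}{\psi} \]

If $v$ is not free in $\varphi$, then:

\[ \frac{\varphi \to \psi(v)}{\varphi \to \forall y\, \psi(y)}, \qquad \frac{\psi(v) \to \varphi}{\exists y\, \psi(y) \to \varphi}. \]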


A proof of $\varphi$ from a set of sentences $T$ (in the formal system H) is a finite
sequence $\psi_1, \ldots, \psi_n$ of formulas, with $\psi_n = \varphi$, each of which is either an
axiom of H, a member of $T$, or else follows from earlier $\psi_i$ by one of the
three rules of inference. We say that $\varphi$ is provable from $T$, and write $T \vdash \varphi$, if
there is a proof of $\varphi$ from $T$.


4.9. GÖDEL COMPLETENESS THEOREM FOR H. Let $T$ be a set of sentences of a
language $L$. A sentence $\varphi$ is provable from $T$ if and only if $\varphi$ is true in all
set-theoretic structures which are models of $T$. In symbols, $T \vdash \varphi$ iff $T \vDash \varphi$.


The easy half of the Completeness Theorem follows from the Soundness 
Lemma. 


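4.10. SOUNDNESS LEMMA. Let $T$ be a set of sentences, $\mathfrak{M}$ a model of $T$. If
$\varphi(v_1, \ldots, v_n)$ is provable from $T$, then $\mathfrak{M} \vDash \forall v_1 \cdots \forall v_n\, \varphi(v_1, \ldots, v_n)$.


PROOF. One proves, by induction on $n$, that if $\psi_1, \ldots, \psi_n$ is a proof from $T$
then $\mathfrak{M} \vDash \forall v_1 \cdots\, \psi_n(v_1, \ldots)$. □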


In the proof of the Completeness Theorem we need the following 
lemma. 


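4.11. LEMMA. Let $T$ be a set of sentences.
(i) If $T \vdash (\varphi \to \psi)$ and $T \vdash (\neg\varphi \to \psi)$, then $T \vdash \psi$.
(ii) If $T \vdash (\varphi \to \theta) \to \psi$, then $T \vdash (\neg\varphi \to \psi)$ and $T \vdash (\theta \to \psi)$.
(iii) If $v$ does not appear in $\psi$ and if $T \vdash [(\exists y\, \varphi(y) \to \varphi(v)) \to \psi]$, then
$T \vdash \psi$.
(iv) If $v$ does not appear in $\psi$ and if $T \vdash (\varphi(v) \to \forall y\, \varphi(y)) \to \psi$, then $T \vdash \psi$.

PROOF. (i) Notice that $[(\varphi \to \psi) \to ((\neg\varphi \to \psi) \to \psi)]$ is a tautology. Thus, if
we write a proof of $(\varphi \to \psi)$ and apply modus ponens we get a proof of
$(\neg\varphi \to \psi) \to \psi$. Then write a proof of $(\neg\varphi \to \psi)$ and apply modus ponens
to get a proof of $\psi$.

(ii) Note, $[((\varphi \to \theta) \to \psi) \to (\neg\varphi \to \psi)]$ and $[((\varphi \to \theta) \to \psi) \to (\theta \to \psi)]$
are tautologies.

(iii) Suppose $T \vdash [(\exists y\, \varphi(y) \to \varphi(v)) \to \psi]$, where $v$ is not free in $\psi$. By
(ii), $T \vdash (\neg\exists y\, \varphi(y) \to \psi)$ and $T \vdash \varphi(v) \to \psi$. Apply the second generalization
rule and we have $T \vdash (\exists y\, \varphi(y) \to \psi)$. But then by (i), $T \vdash \psi$. The proof
of (iv) is similar, but uses the first generalization rule. □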
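
The three propositional facts used in the proof of Lemma 4.11 can be confirmed by a brute-force truth table. The following is a minimal sketch, not part of the original text; the helper names `implies` and `is_tautology` are illustrative, and the variables `p`, `t`, `q` stand for φ, θ, ψ.

```python
from itertools import product


def implies(a, b):
    """Truth table of the connective ->."""
    return (not a) or b


def is_tautology(formula, num_vars):
    """True if the Boolean function `formula` holds under every assignment."""
    return all(formula(*values)
               for values in product([True, False], repeat=num_vars))


# (phi -> psi) -> ((not phi -> psi) -> psi), used in part (i)
taut_i = lambda p, q: implies(implies(p, q), implies(implies(not p, q), q))

# ((phi -> theta) -> psi) -> (not phi -> psi), used in part (ii)
taut_ii_a = lambda p, t, q: implies(implies(implies(p, t), q),
                                    implies(not p, q))

# ((phi -> theta) -> psi) -> (theta -> psi), used in part (ii)
taut_ii_b = lambda p, t, q: implies(implies(implies(p, t), q),
                                    implies(t, q))
```

Each formula is checked by calling, for instance, `is_tautology(taut_i, 2)`; a non-tautology such as `lambda p, q: implies(p, q)` is rejected by the same call.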


PROOF OF THE COMPLETENESS THEOREM. Suppose that $T \vDash \varphi$. By the Main
Lemma (4.8) and the Compactness Theorem for propositional logic, there
is a finite set $S \subseteq T \cup T_{\mathrm{Henkin}} \cup \mathrm{Eq}$ such that $S \cup \{\neg\varphi\}$ is inconsistent in the
sense of propositional calculus. List the members of $S$ in a list $\alpha_1, \ldots, \alpha_N$,
$\beta_1, \ldots, \beta_M$ as follows. The sequence $\alpha_1, \ldots, \alpha_N$ consists of those members of
$S$ which are either in $T \cup \mathrm{Eq}$ or else are quantifier axioms (types III and
IV) listed in any order. The $\beta$’s are the members of $S$ which are Henkin
axioms of types I, II, but we must list them more carefully. Recall the
languages $L = L_0 \subseteq L_1 \subseteq \cdots$ such that $L(C) = \bigcup_n L_n$. Define the rank of
$\varphi \in L(C)$ to be the least $n$ such that $\varphi \in L_n$. Now, choose for $\beta_1$ a Henkin
axiom in $S$ of maximum rank. Choose for $\beta_2$ a Henkin axiom in $S - \{\beta_1\}$ of
maximum rank, etc. The point of arranging things in this way is that the
witnessing constant about which $\beta_i$ speaks is not mentioned in $\beta_{i+1}, \ldots, \beta_M$.
For example, if $\beta_1$ is

\[ \exists v\, \eta(v) \to \eta(c_{\eta(v)}), \]

then $c_{\eta(v)}$ does not appear in any of the other $\beta_2, \ldots, \beta_M$, by the maximality
condition on $\beta_1$.
Recalling that $S \cup \{\neg\varphi\}$ is not consistent in the sense of propositional
logic, and associating parentheses to the right, we see that

\[ (\alpha_1 \to \alpha_2 \to \cdots \to \alpha_N \to \beta_1 \to \cdots \to \beta_M \to \varphi) \]

is a tautology. Replace each witnessing constant in this sentence by a
distinct new variable. The result is still a tautology:

\[ \alpha'_1 \to \alpha'_2 \to \cdots \to \alpha'_N \to \beta'_1 \to \cdots \to \beta'_M \to \varphi', \]

but $\varphi' = \varphi$ since $\varphi$ has no witnessing constants in it. Each of $\alpha'_1, \ldots, \alpha'_N$ is
either a logical axiom or else is a member of $T$, so we may apply modus
ponens $N$ times and obtain a proof of:

\[ \beta'_1 \to \cdots \to \beta'_M \to \varphi. \]

But now we apply parts (iii) and (iv) of Lemma 4.11 to successively remove
$\beta'_1, \beta'_2, \ldots, \beta'_M$ and obtain a proof of $\varphi$. □


Notice that our proof used only the derived rules 4.11 (iii), (iv) so we 
could have used them in place of the more standard rules of H. This is 
discussed in SMULLYAN [1968]. 


The Completeness Theorem for a Gentzen-type formal system G* 


Hilbert-style systems are easy to define and admit a simple proof of the 
Completeness Theorem but they are difficult to use. Gentzen systems 
reverse this situation by emphasizing the importance of inference rules, 
reducing the role of logical axioms to an absolute minimum. 

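We use $\Gamma$, $\Delta$ to range over finite sets of formulas. A sequent is a pair
$(\Gamma, \Delta)$ which is written $\Gamma \vdash \Delta$ and read, informally, as $\Gamma$ yields $\Delta$ or, rather,
the conjunction of all the formulas in $\Gamma$ yields the disjunction of all the
formulas in $\Delta$. We write $\Gamma, \varphi$ for $\Gamma \cup \{\varphi\}$.

We first restrict attention to propositional logic.

Axioms: $\Gamma, \varphi \vdash \Delta, \varphi$.


Rules:

\[
\begin{array}{llll}
(\wedge\vdash) & \dfrac{\Gamma, \varphi, \psi \vdash \Delta}{\Gamma, (\varphi \wedge \psi) \vdash \Delta}
  & (\vdash\wedge) & \dfrac{\Gamma \vdash \Delta, \varphi \qquad \Gamma \vdash \Delta, \psi}{\Gamma \vdash \Delta, (\varphi \wedge \psi)} \\[3ex]
(\vee\vdash) & \dfrac{\Gamma, \varphi \vdash \Delta \qquad \Gamma, \psi \vdash \Delta}{\Gamma, (\varphi \vee \psi) \vdash \Delta}
  & (\vdash\vee) & \dfrac{\Gamma \vdash \Delta, \varphi, \psi}{\Gamma \vdash \Delta, (\varphi \vee \psi)} \\[3ex]
(\neg\vdash) & \dfrac{\Gamma \vdash \Delta, \varphi}{\Gamma, \neg\varphi \vdash \Delta}
  & (\vdash\neg) & \dfrac{\Gamma, \varphi \vdash \Delta}{\Gamma \vdash \Delta, \neg\varphi} \\[3ex]
(\to\vdash) & \dfrac{\Gamma \vdash \Delta, \varphi \qquad \Gamma, \psi \vdash \Delta}{\Gamma, (\varphi \to \psi) \vdash \Delta}
  & (\vdash\to) & \dfrac{\Gamma, \varphi \vdash \Delta, \psi}{\Gamma \vdash \Delta, (\varphi \to \psi)}
\end{array}
\]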


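A derivation in this system is a finite tree of sequents
[diagram omitted: a schematic derivation tree]
with axioms at the upper nodes and such that each sequent on the tree
follows from the ones immediately above it by one of the rules. Rather
than define this precisely, we give an example, a derivation of the sequent
$((\varphi \wedge \psi) \vee \theta) \vdash (\varphi \vee \theta)$. The two top sequents are axioms:

\[
\dfrac{
  \dfrac{\dfrac{\varphi, \psi \vdash \varphi, \theta}{(\varphi \wedge \psi) \vdash \varphi, \theta}\ (\wedge\vdash)}
        {(\varphi \wedge \psi) \vdash (\varphi \vee \theta)}\ (\vdash\vee)
  \qquad
  \dfrac{\theta \vdash \varphi, \theta}{\theta \vdash (\varphi \vee \theta)}\ (\vdash\vee)
}{((\varphi \wedge \psi) \vee \theta) \vdash (\varphi \vee \theta)}\ (\vee\vdash)
\]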


4.12. THE COMPLETENESS THEOREM FOR THE PROPOSITIONAL FRAGMENT OF
G. Let $\Gamma$, $\Delta$ be finite sets of propositional formulas. The following are
equivalent:

(i) Every truth assignment $\nu$ making all $\varphi \in \Gamma$ true makes at least one
$\psi \in \Delta$ true.

(ii) There is a derivation of $\Gamma \vdash \Delta$ using the above axioms and rules.


PROOF. The proof of (ii) $\Rightarrow$ (i) is easy by induction on the length of the
derivation. For the proof of (i) $\Rightarrow$ (ii), start with a pair $(\Gamma, \Delta)$ satisfying (i).
We attempt to build a derivation of $\Gamma \vdash \Delta$ by working backwards. At each
step, we work on a formula in $\Gamma \cup \Delta$ of maximal length, breaking it apart
by means of one of the rules. For example, if $(\psi \to \theta) \in \Gamma$ is the longest
formula in $\Gamma \cup \Delta$ then, at the first stage, we write down

\[ \frac{\Gamma_0 \vdash \Delta, \psi \qquad \Gamma_0, \theta \vdash \Delta}{\Gamma_0, (\psi \to \theta) \vdash \Delta}\ (\to\vdash) \]

where $\Gamma_0 = \Gamma - \{(\psi \to \theta)\}$. We now work on $\Gamma_0 \vdash \Delta, \psi$ and $\Gamma_0, \theta \vdash \Delta$
separately. Eventually we end up with sequents which cannot be broken down
further. If each of the sequents on the ends is an axiom, i.e., has a prime
formula common to both sides, then we have constructed a derivation of
$\Gamma \vdash \Delta$. So suppose that one of these end nodes $\Gamma' \vdash \Delta'$ is not an axiom.
Define $\nu$ on $\Gamma' \cup \Delta'$ by

\[ \nu(p) = \begin{cases} t & \text{if } p \in \Gamma', \\ f & \text{if } p \in \Delta', \end{cases} \]

and define $\nu$ arbitrarily on other prime formulas. This is possible since
$\Gamma' \cap \Delta' = \emptyset$. A case by case examination of the rules shows that every
sequent $\Gamma'' \vdash \Delta''$ beneath $\Gamma' \vdash \Delta'$ also gets $t$ assigned to everything on the left
by $\nu$ but $f$ to everything on the right. In particular, this happens to $\Gamma \vdash \Delta$, a
contradiction. □
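
The backward search described in the proof of 4.12 is effectively an algorithm, and it can be sketched in a few lines. The code below is an illustration under assumptions made here, not the text's own formulation: prime formulas are Python strings, compound formulas are tuples tagged `"not"`, `"and"`, `"or"`, `"imp"`, and the names `size`, `premises`, `search` and `holds` are hypothetical. `search` returns `None` when every branch ends in an axiom, and otherwise returns the truth assignment built from a non-axiom end node.

```python
def size(f):
    """Number of symbols in a formula (atoms are strings, compounds tuples)."""
    return 1 if isinstance(f, str) else 1 + sum(size(g) for g in f[1:])


def premises(f, on_left, gamma, delta):
    """Premises of the rule that breaks apart the compound formula f."""
    op, args = f[0], f[1:]
    if on_left:
        g = gamma - {f}
        if op == "not":
            return [(g, delta | {args[0]})]
        if op == "and":
            return [(g | set(args), delta)]
        if op == "or":
            return [(g | {args[0]}, delta), (g | {args[1]}, delta)]
        if op == "imp":
            return [(g, delta | {args[0]}), (g | {args[1]}, delta)]
    else:
        d = delta - {f}
        if op == "not":
            return [(gamma | {args[0]}, d)]
        if op == "and":
            return [(gamma, d | {args[0]}), (gamma, d | {args[1]})]
        if op == "or":
            return [(gamma, d | set(args))]
        if op == "imp":
            return [(gamma | {args[0]}, d | {args[1]})]
    raise ValueError(f"unknown connective: {op!r}")


def search(gamma, delta):
    """None if gamma |- delta is derivable, else a falsifying assignment."""
    gamma, delta = frozenset(gamma), frozenset(delta)
    compound = [(f, True) for f in gamma if not isinstance(f, str)]
    compound += [(f, False) for f in delta if not isinstance(f, str)]
    if not compound:
        if gamma & delta:
            return None  # an axiom: a prime formula occurs on both sides
        # end node that is not an axiom: left side true, right side false
        return {**{p: False for p in delta}, **{p: True for p in gamma}}
    # work on a formula of maximal length, as in the proof
    f, on_left = max(compound, key=lambda item: size(item[0]))
    for g, d in premises(f, on_left, gamma, delta):
        counter = search(g, d)
        if counter is not None:
            return counter
    return None


def holds(f, v):
    """Truth value of f under assignment v (unlisted atoms default to False)."""
    if isinstance(f, str):
        return v.get(f, False)
    op, args = f[0], f[1:]
    if op == "not":
        return not holds(args[0], v)
    if op == "and":
        return holds(args[0], v) and holds(args[1], v)
    if op == "or":
        return holds(args[0], v) or holds(args[1], v)
    return (not holds(args[0], v)) or holds(args[1], v)
```

For the example sequent derived above, `search({("or", ("and", "p", "q"), "r")}, {("or", "p", "r")})` returns `None`. For the underivable sequent with `("or", "p", "q")` on the left and `"p"` on the right, it returns an assignment under which the left side holds and the right side fails, which can be confirmed with `holds`.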


To pass from propositional logic to first-order logic we add equality 
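axioms, an equality rule and four quantifier rules.

Equality axioms: $\Gamma \vdash \Delta, (t = t)$ ($t$ any term).
Equality rule: If $E$ is $(t_1 = t_2)$ or $(t_2 = t_1)$, then

\[ \frac{\Gamma, \varphi(t_1) \vdash \Delta, \psi(t_1)}{\Gamma, E, \varphi(t_2) \vdash \Delta, \psi(t_2)}. \]

Quantifier rules:

\[
(\forall\vdash)\ \frac{\Gamma, \varphi(t) \vdash \Delta}{\Gamma, \forall y\, \varphi(y) \vdash \Delta}
\qquad
(\vdash\exists)\ \frac{\Gamma \vdash \Delta, \varphi(t)}{\Gamma \vdash \Delta, \exists y\, \varphi(y)}
\]

In the next two rules, the variable $v$ is not allowed to occur free in $\Gamma \cup \Delta$:

\[
(\vdash\forall)\ \frac{\Gamma \vdash \Delta, \varphi(v)}{\Gamma \vdash \Delta, \forall y\, \varphi(y)}
\qquad
(\exists\vdash)\ \frac{\Gamma, \varphi(v) \vdash \Delta}{\Gamma, \exists y\, \varphi(y) \vdash \Delta}
\]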


The formal system G has the above axioms and rules of inference. The
system G* has, in addition, a rule called “cut” which is the counterpart of
modus ponens:

\[ \text{cut:}\quad \frac{\Gamma, \varphi \vdash \Delta \qquad \Gamma \vdash \Delta, \varphi}{\Gamma \vdash \Delta}. \]

Given a set $T$ of sentences of $L$, we say that a formula $\varphi$ is derivable
from $T$ in G* if there is a derivation of $\Gamma \vdash \{\varphi\}$ for some finite $\Gamma \subseteq T$.


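4.13. THE COMPLETENESS THEOREM FOR G*. A sentence $\varphi$ is derivable from a
set $T$ of sentences in the system G* iff $T \vDash \varphi$.


PROOF. The ($\Rightarrow$) direction follows by a Soundness Lemma entirely
analogous to 4.10. To prove ($\Leftarrow$) we use the Main Lemma, the Compactness
Theorem for propositional logic and 4.12. By the Main Lemma and
the Compactness Theorem for Propositional Logic, there is a finite
$S \subseteq T \cup T_{\mathrm{Henkin}} \cup \mathrm{Eq}$ such that every truth valuation $\nu$ making $S$ true
makes $\varphi$ true. Using the exact notation as in the proof of 4.9, let
$S = \{\alpha_1, \ldots, \alpha_N, \beta_1, \ldots, \beta_M\}$, and let $\alpha'_1, \ldots, \alpha'_N, \beta'_1, \ldots, \beta'_M$ be as before.
Thus, if $\Gamma = \{\alpha'_1, \ldots, \alpha'_N, \beta'_1, \ldots, \beta'_M\}$, we see that $\Gamma \vdash \varphi$ is derivable in G*
using only the axioms and rules of propositional logic, by 4.12. Let
$\Gamma_0 = \{\alpha'_1, \ldots, \alpha'_K\}$, $K \le N$, be the subset of $\{\alpha'_1, \ldots, \alpha'_N\}$ of members of $T$. We
can turn the derivation of $\Gamma \vdash \varphi$ into a derivation of $\Gamma_0 \vdash \varphi$ by means of the
following claims.
(1) If $\theta$ is an equality axiom or a quantifier axiom III or IV, then $\emptyset \vdash \theta$.
(2) If $\Gamma \vdash \Delta$, then $\Gamma \cup \Gamma' \vdash \Delta \cup \Delta'$.
These are simple and from them we obtain $\Gamma_0 \vdash \alpha'_i$ for all $K < i \le N$ so we
can apply cut $N - K$ times and get a derivation of $\Gamma_0 \cup \{\beta'_1, \ldots, \beta'_M\} \vdash \varphi$.
(3) If $\Gamma, (\varphi \to \theta) \vdash \Delta$, then $\Gamma \vdash \Delta, \varphi$ and $\Gamma, \theta \vdash \Delta$.
This is similar to 4.11(ii). We prove the first.

\[
\dfrac{
  \dfrac{\Gamma, (\varphi \to \theta) \vdash \Delta \quad \text{(hypothesis)}}
        {\Gamma, (\varphi \to \theta) \vdash \Delta, \varphi}\ \text{(by 2)}
  \qquad
  \dfrac{\Gamma, \varphi \vdash \Delta, \theta, \varphi \quad \text{(axiom)}}
        {\Gamma \vdash \Delta, (\varphi \to \theta), \varphi}\ (\vdash\to)
}{\Gamma \vdash \Delta, \varphi}\ \text{(cut)}
\]

The second is similar. Combining these, the rules ($\vdash\forall$) and ($\exists\vdash$), and cut
we obtain:
(4) If $\Gamma, \exists y\, \varphi(y) \to \varphi(v) \vdash \Delta$ and $v$ is not free in $\Gamma \cup \Delta$, then $\Gamma \vdash \Delta$.
(5) If $\Gamma, \varphi(v) \to \forall y\, \varphi(y) \vdash \Delta$ and $v$ is not free in $\Gamma \cup \Delta$, then $\Gamma \vdash \Delta$.
Using these derived rules we remove $\beta'_1, \ldots, \beta'_M$ from the hypothesis and
obtain a derivation of $\Gamma_0 \vdash \varphi$. □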


We have used the cut rule heavily in our proof of the Completeness 
Theorem for G*, but it did not enter into the proof for the propositional 
part. Is it really necessary? No. One can, by working directly with the 
system G, and expanding on the proof of the propositional part, prove the 
completeness of the system G. See, e.g. SMULLYAN [1968]. This shows that 
the cut rule can be eliminated. Historically, things went the other way 
round, and constituted the first important chapter in proof theory after 
Gödel’s Incompleteness Theorems.


4.14. CUT-ELIMINATION THEOREM (Gentzen). Any sequent which has a
derivation with the cut rule has one without it. I.e., G* and G have the same 
derivable sequents. 


Gentzen’s proof was by a complicated double induction, showing how to
transform any derivation allowing the cut rule into a derivation without
cut. By analyzing such inductive proofs one is able to obtain a precise least
upper bound to the methods of induction which are provable in first-order
arithmetic. This topic is treated in detail in Chapter D.2 on cut-elimination.
The system used there is simpler since only formulas in so-called “negation
normal form” are considered.

This concludes our discussion of the Completeness Theorem. We leave
the reader with the instructive exercise of proving $\exists y\, [\varphi(y) \to \forall x\, \varphi(x)]$ in
the systems H, G and G*. The informal proof is: pick $y$ so that $\neg\varphi(y)$, if
there is such a $y$; otherwise let $y$ be arbitrary. Thus, if $\varphi(y)$, then $\forall x\, \varphi(x)$.
The proof in G* is similar.
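
For comparison, the informal argument can be checked mechanically. The following Lean 4 sketch is an illustration added here and is not a proof in any of the systems H, G or G*: it assumes a nonempty domain (the `Inhabited` instance, matching the usual convention for first-order structures) and uses classical reasoning. The name `drinker` is a hypothetical label.

```lean
-- Case split on whether some y fails p, exactly as in the informal proof.
theorem drinker {α : Type} [Inhabited α] (p : α → Prop) :
    ∃ y, (p y → ∀ x, p x) :=
  (Classical.em (∃ y, ¬ p y)).elim
    -- some y with ¬ p y exists: the implication at y holds vacuously
    (fun ⟨y, hy⟩ => ⟨y, fun hp => absurd hp hy⟩)
    -- no such y: p holds everywhere, so any y (here `default`) works
    (fun h => ⟨default, fun _ x =>
      Classical.byContradiction (fun hx => h ⟨x, hx⟩)⟩)
```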

5. Beyond first-order logic


Many logicians would contend that there is no logic beyond first-order 
logic, in the sense that when one is forced to make all one’s mathematical 
(extra-logical) assumptions explicit, these axioms can always be expressed 
in first-order logic, and that the informal notion of provable used in 
mathematics is made precise by the formal notion provable in first-order 
logic. Following a suggestion of Martin Davis, we refer to this view as
Hilbert’s Thesis. 

The first part of Hilbert’s Thesis, that all of classical mathematics is 
ultimately expressible in first-order logic, is supported by empirical evi- 
dence. It would indeed be revolutionary were someone able to introduce a 
new notion which was obviously part of logic. The second part of Hilbert’s 
Thesis would seem to follow from the first part and Gödel’s Completeness
Theorem. Thus Hilbert’s Thesis is, to some extent, accepted by many 
mathematical logicians. 

Even those who accept Hilbert’s Thesis in theory, however, are a far cry 
from accepting it in practice. It would be completely impractical and, in 
fact, counter-productive, to always make all one’s extra-logical assump- 
tions explicit. 

Let us reconsider a couple of examples from Section 2. 


Example. The axiom $\forall x\, \exists n \ge 1\, (nx = 0)$ expressing the torsion property
for abelian groups is not a first-order axiom (by 2.3). If we were to apply 
Hilbert’s Thesis in this case, we would have to axiomatize not only group 
theory but also the properties of natural numbers needed to carry out the 
arguments we were after. This would mean that the theory of torsion 
groups encompasses all of first-order number theory, something clearly not 
in the spirit of modern algebra. 


Example. The notions of metric space and Hilbert space are relatively 
simple, modulo the ordered field R of real numbers. As we saw in Section 
2, however, R is only categorical relative to set theory. It would be
counterproductive, though, even to pretend to formulate all one’s theorems about
metric spaces within some formal system of set theory, when most of what
one wants to do is first-order modulo the field R. There is no point in
saddling the study of metric spaces with any more of the problems inherent
in set theory, unsolvable problems the likes of which mathematics has
never known before, than necessary.

The examples show that it is common mathematical practice to accept
certain notions and structures as basic and work axiomatically from there 
on, even when we are aware that these notions cannot be completely 
axiomatized in the restricted language of first-order logic in and of 
themselves. The algebraist constantly takes the notion finite as basic. The 
student of analysis, metric spaces and Hilbert spaces begins with the 
structure R of reals. Logicians have developed strengthenings of first-order 
logic which allow him to be more faithful to this mathematical practice, 
logics which absorb certain mathematical notions, or structures, into the 
logic, in the same way that the algebraist attempts to absorb the notion of 
finite into his informal logic. In this section we briefly discuss some of these 
extensions of first-order logic. 


5.1. Many-sorted first-order logic 

Two-sorted first-order logic is just like ordinary first-order logic except
that one has two distinct sorts of variables. For example, the natural way of
writing the axioms for vector spaces is to have one sort of variable $r, s, t, \ldots$
over scalars (elements of a field $\mathfrak{F}$) and a different sort $v, w, \ldots$ over
vectors. A vector space consists of a triple $(\mathfrak{F}, \mathfrak{V}, \cdot)$ where $\mathfrak{F}$ is a field,
$\mathfrak{V} = (V, +, 0)$ the structure of vectors with vector addition, and the
operation $\cdot$ of scalar multiplication. In general, a two-sorted structure
$(\mathfrak{M}, \mathfrak{N}, \ldots)$ consists of two ordinary structures plus some functions and
relations on their union. Two-sorted (or many-sorted) logic is only superficially
stronger, though often more natural, than ordinary logic, since we
can always take a structure $(\mathfrak{M}, \mathfrak{N}, \ldots)$ and turn it into an ordinary structure
$(M \cup N, M, N, \ldots, \ldots, \ldots)$ with unary predicates $M$ and $N$ to sort out the
different sorts of elements. This reduction allows most results of first-order
logic to be transferred to many-sorted logic, and is part of the evidence for
Hilbert’s Thesis. Malcev and Feferman, among others, have stressed the
advantages of working directly with the many-sorted case. A good
introduction can be found in FEFERMAN [1974].
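
As an illustration of this reduction, here is how one two-sorted sentence might be translated. This is a sketch of the standard relativization of quantifiers, not taken from the text: the choice of the distributive law as the example and the names $F$ and $V$ for the unary sort predicates are assumptions made here. Each universal quantifier is guarded by an implication from its sort predicate; an existential quantifier $\exists r\, \varphi$ would correspondingly become $\exists x\, (F(x) \wedge \varphi)$.

```latex
\begin{align*}
\text{two-sorted:}\quad
  & \forall r\, \forall v\, \forall w\;
    \bigl( r \cdot (v + w) = r \cdot v + r \cdot w \bigr) \\
\text{one-sorted:}\quad
  & \forall x\, \forall y\, \forall z\;
    \bigl( F(x) \wedge V(y) \wedge V(z) \to
           x \cdot (y + z) = x \cdot y + x \cdot z \bigr)
\end{align*}
```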


5.2. ω-logic

If we take a two-sorted language and consider two-sorted structures
$(\mathfrak{M}, \mathfrak{N}, \ldots)$ with a fixed structure $\mathfrak{N}$, then we obtain so-called $\mathfrak{N}$-logic. For
example, $\mathbb{R}$-logic is appropriate to the study of metric spaces and real
Hilbert spaces. For $\mathbb{N}$ the structure of natural numbers, $\mathbb{N}$-logic is usually
called $\omega$-logic. It is appropriate to the study of, say, Euclidean rings, since a
Euclidean ring is a ring $R$ with a function $d : R \to \mathbb{N}$ satisfying the usual
first-order axioms. As long as $\mathfrak{N}$ is infinite, $\mathfrak{N}$-logic is stronger than
first-order logic. For example, the Compactness Theorem must fail for
$\mathfrak{N}$-logic.


5.3. Weak second-order logic 

Weak second-order logic is an attempt to build the notion of finite into
logic in a natural way. Let $L$ be a given first-order language. Let $x, y, z$ be
the variables of $L$. From $L$ we form a new two-sorted language $L^*$ which
has variables $a, b, c$ and a membership symbol $\in$. Given a structure
$\mathfrak{M} = (M, \ldots)$ for $L$ we expand it to a structure $\mathrm{HF}(\mathfrak{M})$ for $L^*$, called the
structure of hereditarily finite sets on $\mathfrak{M}$, as follows. Let

\begin{align*}
\mathrm{HF}_0(M) &= \emptyset, \\
\mathrm{HF}_{n+1}(M) &= \{\text{all finite subsets of } M \cup \mathrm{HF}_n(M)\}, \\
\mathrm{HF}(M) &= \bigcup_n \mathrm{HF}_n(M).
\end{align*}

Then $\mathrm{HF}(\mathfrak{M}) = (\mathfrak{M}, \mathrm{HF}(M), \in \restriction (M \cup \mathrm{HF}(M)))$. In weak second-order
logic (more accurately called weak finite-type logic) we allow ourselves to
use formulas of $L^*$ and to interpret the set variables $a, b, c$ over $\mathrm{HF}(M)$.
Anyone familiar with the development of intuitive set theory in, say, ZF,
will realize that we can define the natural numbers in $\mathrm{HF}(\mathfrak{M})$, and the
notions of finite sequence in $\mathrm{HF}(\mathfrak{M})$. In fact, $\mathrm{HF}(\mathfrak{M})$ is admissible (see
Chapter A.7 on admissible sets, in particular, 3.1 and 2.16), so we can
define functions by recursion. In particular, all of the sentences of Section 2
which we called weak second-order are easily seen to be weak second-order
in this precise sense. Weak second-order logic has essentially the
same strength as $\omega$-logic, but is much more natural in the context of
algebra, since one can work directly with integers, finite sets, finite
sequences, etc.
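
The first few levels of the hierarchy can be computed explicitly when the universe is finite. The following is a small sketch under that assumption (for infinite M each level is itself infinite and cannot be enumerated this way); the function names `finite_subsets` and `hf_level` are illustrative, and elements of M are modelled as strings while sets are modelled as `frozenset`.

```python
from itertools import combinations


def finite_subsets(universe):
    """All subsets of a finite collection, each as a frozenset."""
    items = list(universe)
    return {frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)}


def hf_level(M, n):
    """HF_n(M) for a finite set M of urelements."""
    level = set()  # HF_0(M) is empty
    for _ in range(n):
        # HF_{k+1}(M) = all finite subsets of M together with HF_k(M)
        level = finite_subsets(set(M) | level)
    return level
```

For a one-element universe, `hf_level({"a"}, 1)` has 2 elements and `hf_level({"a"}, 2)` has 8, since the second level consists of all subsets of a 3-element collection. The levels grow as iterated exponentials, so only very small values of `n` are practical.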


5.4. Infinitary logic 

Weak second-order logic attempts to absorb the notion of finite into the
semantics (meaning) of the logic. It has turned out to give a more elegant
theory, however, to absorb it into the syntax of the logic by allowing infinite
formulas, like

\[ \forall x\, [x = 0 \vee 2x = 0 \vee \cdots]. \]

The logic $L_{\omega_1\omega}$ allows the additional formation rule: if $\Phi$ is a countable set
of formulas then $\bigwedge \Phi$ (the conjunction of $\Phi$) and $\bigvee \Phi$ (the disjunction of
$\Phi$) are formulas. This logic is discussed in several of the chapters in this
part of the book. The notation $L_{\omega_1\omega}$ is explained by the fact that countable
($< \omega_1$) conjunctions and disjunctions are permitted but only finite ($< \omega$)
strings of quantifiers. One can think of the logic $L_{\omega_1\omega}$ as expressing those
notions which are first-order modulo a countable amount of information.
The Löwenheim-Skolem Theorem holds for $L_{\omega_1\omega}$, but not the Compactness
Theorem. To get a Completeness Theorem, for countable theories $T$ of
$L_{\omega_1\omega}$, one must add an infinitary rule of proof.

It is easy to translate weak second-order logic into $L_{\omega_1\omega}$, but not
vice-versa. In particular, in weak second-order logic there are relations
which are implicitly, but not explicitly definable, something that cannot
happen in first-order logic or $L_{\omega_1\omega}$, by Beth’s Theorem. If one looks for the
smallest logic containing weak second-order logic in which all implicitly
definable relations are explicitly definable, then one is led to the study of
admissible fragments $L_A$ of $L_{\omega_1\omega}$, as studied in Chapter A.7.


5.5. Logic with new quantifiers 

It is easy to see that all finitary propositional connectives are definable
from the ones in first-order logic, in fact from $\vee$ and $\neg$. Mostowski long
ago raised the possibility of adding new quantifiers to first-order logic.
Thus, let $Q$ be a new symbol and allow the formation rule: if $\varphi(x)$ is a
formula, so is $Qx\, \varphi(x)$. There are many possible interpretations of $Q$. For
example, we could define $\mathfrak{M} \vDash Qx\, \varphi(x)$ iff there are infinitely many $x$ such
that $\mathfrak{M} \vDash \varphi(x)$. This logic, called $L(Q_0)$, is essentially equivalent to $\omega$-logic
and weak second-order logic.

If we define $\mathfrak{M} \vDash Qx\, \varphi(x)$ iff there are uncountably many $x$ such that
$\mathfrak{M} \vDash \varphi(x)$ then we obtain logic with the quantifier “there exist uncountably
many”. This logic, unlike all the others mentioned earlier, has a Completeness
Theorem and Compactness Theorem entirely analogous to that of
ordinary first-order logic (as long as the set $L$ of symbols is finite or
countable). Few people would claim that the notion of uncountable is a
logical, rather than mathematical, notion, but the Completeness Theorem
of KEISLER [1970] for this logic does give one pause. The notions of “many”
and “most” seem almost logical and various precise mathematical notions
like “measure 1”, “second category”, “infinitely many”, and “uncountably
many” use the intuitive notions for motivation. The Completeness
Theorem of Keisler for “there exist uncountably many” shows that this
notion provides a mathematically precise model for one informal concept
of “many”. Written out in English, using “many” and “few” for “uncountable”
and “not uncountable” respectively, Keisler’s basic axioms are:

(1) For all $y$ there are few $x$ such that $x = y$.

(2) If $\varphi(x) \to \psi(x)$ for all $x$, and if many $x$ satisfy $\varphi$ then many $x$ satisfy
$\psi$.

(3) If many $x$ satisfy $(\varphi(x) \vee \psi(x))$, then either many $x$ satisfy $\varphi$ or many
$x$ satisfy $\psi$.

(4) If there are only a few $x$ for which $\exists y\, \varphi(x, y)$, and if for each $x$ there
are only a few $y$ such that $\varphi(x, y)$, then there are only a few $y$ for which
$\exists x\, \varphi(x, y)$.
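
Reading $Qx$ as “there are many $x$”, the four English statements above can be transcribed roughly as follows. This symbolization is a sketch made here for illustration; it is a direct rendering of the English sentences, and Keisler’s own formulation of the axioms may differ in detail.

```latex
\begin{align*}
(1)\quad & \forall y\, \neg Qx\, (x = y) \\
(2)\quad & \forall x\, (\varphi(x) \to \psi(x)) \to
           (Qx\, \varphi(x) \to Qx\, \psi(x)) \\
(3)\quad & Qx\, (\varphi(x) \vee \psi(x)) \to
           (Qx\, \varphi(x) \vee Qx\, \psi(x)) \\
(4)\quad & \bigl( \neg Qx\, \exists y\, \varphi(x, y) \wedge
           \forall x\, \neg Qy\, \varphi(x, y) \bigr) \to
           \neg Qy\, \exists x\, \varphi(x, y)
\end{align*}
```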

Notice that “there are many x” is not a consequence of the axioms since
the axioms hold in all structures, finite, countable, or uncountable. 
Sometimes, late at night, one can almost imagine some other world where 
such axioms are considered laws of thought in the same way that we accept 
the laws of first-order logic. But now we are entering the realm of science 
fiction, or mathematical fiction, so we had better stop. 


5.6. Abstract model theory 

Recent years have witnessed the foundation of a new branch of model 
theory. Abstract model theory steps back and surveys the whole spectrum 
of logics and the relationships between them. A logic consists of a syntax 
and a semantics which fit together nicely, in the sense that elementary 
syntactic operations (like renaming symbols) are performable and have 
their desired meaning. Glancing at the above examples should give one a 
feeling for a more precise definition (see BARWISE [1974]).

Of the above examples, weak second-order logic, $L_{\omega_1\omega}$ and $L(Q_0)$ all
satisfy the Löwenheim-Skolem Theorem (Theorem 2.5 with $\kappa = \aleph_0$) but
not the Compactness Theorem. On the other hand, $L(Q)$, where $Q$ means
uncountable, satisfies the Compactness Theorem but not the
Löwenheim-Skolem Theorem. This is explained by one of the first, and
still most striking, results of abstract model theory. A proof can be found in
BARWISE [1974] where other references are also given.


5.7. THEOREM (Lindström). First-order logic is the only logic closed under
$\wedge$, $\neg$, $\exists$ which satisfies the Compactness Theorem and the
Löwenheim-Skolem Theorem.


1. Introduction


Model theory may be described as the union of logic and universal 
algebra. The leading characters are the sentences $\varphi$ and the structures $\mathfrak{M}$
for a language L. As we shall see in this chapter, classical examples in 
algebra have led to many notions in model theory. The use of sentences as 
mathematical objects is a powerful tool which has led to unexpected 
applications outside of model theory. 

For example, 300 years ago Leibniz conjectured that calculus could be 
developed rigorously by extending the real number system R to a structure 
*R such that *R contains infinitesimals but every statement true of R is true 
of *R. There was no chance of solving this problem until model theory 
developed to the point where the statements could be defined precisely and 
treated as mathematical objects. Abraham Robinson solved the problem in 
1960, taking the statements to be the formulas of first order logic. In the 
simplest formulation one starts with the structure $\mathfrak{R}$ which has the set of
reals R for its universe, and a symbol for each real relation and function.
The first step is to form a proper elementary extension $^*\mathfrak{R}$ of $\mathfrak{R}$, that is, a
proper extension which satisfies the same first order formulas as $\mathfrak{R}$. This
elementary extension is obtained easily using general results in model
theory, either the compactness theorem or the Łoś ultraproduct theorem.
The surprise was the next step; Robinson showed that the early intuitive 
arguments with infinitesimals could be rigorously carried out in *R, and 
new results quickly followed. Robinson’s infinitesimal analysis is presented 
in Chapter A.6. Other applications, mostly to algebra, are presented in other 
chapters. 

We find two historical threads in the development of model theory. In 
North America these are often called western and eastern model theory, 
because Tarski has lived on the west coast since the 1940’s, and Robinson 
was on the east coast from 1967 to his premature death in 1975. The 
distinction no longer has anything to do with geography, but it is still 
mathematically useful. 

Western model theory is in the tradition of Skolem and Tarski. It is largely 
motivated by problems in number theory, analysis, and set theory. It 
emphasizes the set of all formulas of first order logic. 

Eastern model theory is in the tradition of Malcev and Robinson. It is
motivated by problems in abstract algebra, where theories usually have at 
most two blocks of quantifiers. It emphasizes the set of quantifier-free 
formulas and the set of existential formulas. 

Many model theorists move back and forth between western and eastern
model theory. In fact, Tarski and Robinson made major contributions to
both areas (Tarski’s work on real closed fields and on equational classes are 
eastern model theory, while Robinson’s consistency theorem and his 
infinitesimal analysis are western model theory). 

In this chapter we shall present some of the methods of both western and 
eastern model theory. Many notions and results have two distinct versions, 
a western version dealing with arbitrary formulas, and an eastern version 
dealing with quantifier-free formulas. We shall often use the prefix “basic” 
for the eastern version to distinguish it from the western version. 

The deeper proofs in model theory usually depend on the construction of 
a model with certain properties. The constructions almost always use one 
or more methods from the following list. 

Elementary chains, 

diagrams and other expansions of the language, 

compactness theorem, 

downward Léwenheim-Skolem theorem, 

omitting types theorem, 

forcing, 

ultraproducts, 

homogeneous sets. 

We shall explain all but the last two in this chapter. For ultraproducts and 
homogeneous sets, see Chapters A.3 and A.5.

This chapter will concentrate on the simplest nontrivial language, first 
order logic. For some applications, many-sorted logic (logic with many 
sorts of variables) is more natural. Our presentation can be routinely 
extended to many-sorted logic but the notation would be more compli- 
cated. In the last section we indicate how our methods extend to the 
stronger logics where model theory has been successful. 

We assume that the reader knows the material in Chapter A.1, especially 
the important definition of satisfaction of a formula in a structure, 
$\mathfrak{M} \vDash \varphi[s]$. A useful collection of survey articles on model theory is in the
book MORLEY [1973]. For further study in model theory we suggest the
books CHANG and KEISLER [1973], and SACKS [1972].


2. Theories 


In this section we present some of the fundamental notions of model 
theory. In particular we shall discuss various ways of classifying theories in 
first order logic. We shall build upon Chapter A.1.

We consider a first order language $L$. By a theory $T$ in $L$ we mean a set of
sentences of $L$. The notion $\mathfrak{M} \vDash T$, read $\mathfrak{M}$ is a model of $T$, means that $\mathfrak{M}$ is
a structure for $L$ such that every $\varphi \in T$ holds in $\mathfrak{M}$. Two theories $T_1$ and $T_2$
are said to be equivalent if they have exactly the same models. $T$ is
consistent, or satisfiable, if it has at least one model. There are two ways in
which theories commonly arise.

First, consider a class K of structures for L. The theory of K is the set 
Th(K) of all sentences $\varphi$ of L such that $\varphi$ holds in every $\mathfrak{M} \in K$. For
example, if K is the class of finite groups then Th(K), the theory of finite 
groups, is the set of all sentences true in all finite groups. 

K is said to be an elementary class, or an $\mathrm{EC}_\Delta$ class, if K is the class of all
models of some theory T (hence of Th(K)). The class of all groups is an
elementary class. However Proposition 4.6 in this chapter shows that the
class of finite groups is not an elementary class. That is, there are infinite
groups $\mathfrak{M}$ which satisfy every sentence true of all finite groups. To
characterize the notion of a finite group one must go beyond first order 
logic. Other examples of non-elementary classes are in Chapter A.1. 

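Given a structure $\mathfrak{M}$, the theory of $\mathfrak{M}$, $\mathrm{Th}(\{\mathfrak{M}\})$ or $\mathrm{Th}(\mathfrak{M})$, is the set of all
sentences true in $\mathfrak{M}$.


2.1. DEFINITION. A theory $T$ is complete if $T$ is equivalent to $\mathrm{Th}(\mathfrak{M})$ for
some structure $\mathfrak{M}$.


Two structures $\mathfrak{M}$ and $\mathfrak{N}$ are elementarily equivalent, $\mathfrak{M} \equiv \mathfrak{N}$, if
$\mathrm{Th}(\mathfrak{M}) = \mathrm{Th}(\mathfrak{N})$. Thus $\mathfrak{M} \equiv \mathfrak{N}$ means that $\mathfrak{M}$ and $\mathfrak{N}$ satisfy exactly the same
sentences.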

The second way in which theories commonly arise is as simple sets of 
sentences. A set of sentences which is equivalent to Th(K) is called a set of 
axioms for K. Examples are the theories of groups, abelian groups, 
divisible groups, and torsion-free groups as presented in Chapter A.1. The 
first two examples are finitely axiomatizable, while the last two are 
recursively but not finitely axiomatizable. A language L is said to be 
recursive if L is finite or countable with a recursive set of symbols. 


2.2. DEFINITION. A theory T is finitely axiomatizable if it is equivalent to a
finite set of sentences. A theory T is recursively axiomatizable if it is
equivalent to a recursive set of sentences in a recursive language.


The compactness theorem can often be used to show that a theory is not
finitely axiomatizable; see Chapters A.1 and A.3. Of course, finite
axiomatizability implies recursive axiomatizability. It is perhaps helpful
to list some additional examples.


2.3. EXAMPLE. The following theories are finitely axiomatizable:
Groups,
abelian groups,
rings,
integral domains,
fields,
fields of characteristic p (p ≠ 0, fixed),
ordered fields,
linear order,
lattices,
Boolean algebras,
Bernays-Gödel set theory,
the inconsistent theory.


2.4. EXAMPLE. The following theories are recursively but not finitely
axiomatizable:
Divisible groups,
torsion-free groups,
fields of characteristic zero,
algebraically closed fields,
real closed fields,
finite fields (Ax [1968]),
Zermelo-Fraenkel set theory,
Peano arithmetic.

It turns out that the theory of finite groups is not even recursively
axiomatizable (COBHAM [1962]).


Here is a convenient characterization of recursively axiomatizable
theories. A sentence $\varphi$ is a consequence of T, $T \models \varphi$, if
every model of T is a model of $\varphi$.


2.5. THEOREM. A theory T in a recursive language is recursively
axiomatizable if and only if the set of all consequences of T is recursively
enumerable.


Usually the set of consequences of a theory T is not recursive. Church’s 
Theorem shows that if L has at least one binary relation or function 
symbol, the set of consequences of the empty theory (i.e., the valid
sentences) is not recursive. The rare theories which do have recursive sets
of consequences are of considerable interest.


2.6. DEFINITION. A theory T in a recursive language is decidable if the set 
of all consequences of T is recursive. 


Intuitively this means that there is an algorithm for deciding whether an
arbitrary sentence $\varphi$ is a consequence of T. Here is a simple but useful
sufficient condition for decidability which is proved in 7.2 of Chapter C.1.


2.7. THEOREM. Every complete recursively axiomatizable theory is 
decidable. 
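The idea behind Theorem 2.7 can be sketched as a search: since T is complete, one of $\varphi$, $\neg\varphi$ must eventually appear in any enumeration of the consequences of T. The code below is a hedged illustration only. The function `decide`, the string encoding of sentences with "~" for negation, and the toy enumerator are assumptions of this sketch; a real enumerator would list proofs from a recursive axiom set.

```python
from itertools import count

def decide(phi, consequences):
    """Return True if phi is a consequence of T, False if its negation is.

    `consequences` is any iterator enumerating the consequences of a complete,
    consistent theory T. Sentences are strings and negation is written "~";
    both conventions are assumptions of this sketch.
    """
    neg = phi[1:] if phi.startswith("~") else "~" + phi
    for psi in consequences:
        if psi == phi:
            return True
        if psi == neg:
            return False

def toy_enumeration():
    """Hypothetical stand-in for a proof enumerator: P_n holds iff n is even."""
    for n in count():
        yield f"P{n}" if n % 2 == 0 else f"~P{n}"

print(decide("P4", toy_enumeration()))  # True
print(decide("P7", toy_enumeration()))  # False
```

The loop terminates on every input only because the theory is assumed complete; for an incomplete theory the search may run forever.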


Model theory has developed some powerful techniques for showing that 
a theory is complete; when the theory in question is a recursive set of 
sentences, we can conclude that the theory is decidable. For example, in a 
classical paper, Tarski and McKinsey [1948] showed that the theory of real 
closed fields is complete and hence decidable. The problem can also be 
viewed as follows. Given a structure $\mathfrak{M}$ (such as the field of reals),
determine whether $\mathrm{Th}(\mathfrak{M})$ is decidable, and if so, find a recursive set of
axioms. For more about decidable and undecidable theories see Chapters 
A.4 and C.3. 

We now turn to some notions which are motivated by algebraic 
phenomena. 


2.8. DEFINITION. A universal formula is a formula of the form

$$\forall x_1 \cdots \forall x_n\, \varphi$$

where $\varphi$ has no quantifiers. A universal theory is a theory which is
equivalent to a set of universal sentences.

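$\mathfrak{N}$ is a substructure of $\mathfrak{M}$, $\mathfrak{N} \subseteq \mathfrak{M}$,
if the universe of $\mathfrak{N}$ is a subset of the universe of $\mathfrak{M}$ and
the interpretation of each relation, function, or constant symbol in
$\mathfrak{N}$ is the restriction of the corresponding interpretation in
$\mathfrak{M}$. Equivalently, for every atomic formula $\varphi$ and assignment s
in $\mathfrak{N}$,

$$\mathfrak{N} \models \varphi[s] \quad\text{iff}\quad \mathfrak{M} \models \varphi[s].$$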


Given a nonempty subset $X \subseteq M$, there is a least substructure
$\mathfrak{N} \subseteq \mathfrak{M}$ containing X, called the substructure
generated by X. A finitely generated substructure of $\mathfrak{M}$ is a
substructure generated by a finite $X \subseteq M$.
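For a finite structure, the universe of the substructure generated by X can be computed by closing X under the constants and functions of the language. The sketch below is illustrative only; the function name `generated_substructure` and the encoding of operations as (arity, function) pairs are assumptions, not notation from the text.

```python
from itertools import product

def generated_substructure(X, operations, constants=()):
    """Smallest set containing X and the constants, closed under the operations.

    `operations` is a list of (arity, function) pairs. The universe is assumed
    finite, so the closure loop terminates.
    """
    closed = set(X) | set(constants)
    changed = True
    while changed:
        changed = False
        for arity, op in operations:
            # Iterate over a snapshot, since `closed` may grow inside the loop.
            for args in product(list(closed), repeat=arity):
                value = op(*args)
                if value not in closed:
                    closed.add(value)
                    changed = True
    return closed

add12 = lambda a, b: (a + b) % 12   # addition in Z_12
print(sorted(generated_substructure({4}, [(2, add12)], constants=[0])))  # [0, 4, 8]
```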


2.9. PROPOSITION. If T is a universal theory, then every substructure of a
model of T is a model of T.

In the next section we shall prove the converse of this fact, the
Łoś-Tarski Theorem.

For example, the theory of groups formulated in the language {+, 0, −}
is universal, and every substructure of a group is a group. However, when
group theory is formulated in the language {+, 0}, it is no longer a
universal theory because an existential quantifier is needed to assert the
existence of an inverse,

$$\forall x\, \exists y\, (x + y = 0 \wedge y + x = 0).$$

When formulated in the language {+, 0}, a substructure of a group is a
semigroup with unit but not necessarily a group.

Some other examples of universal theories are the theories of abelian 
groups, rings, integral domains, lattices, linear order, Boolean algebras, 
and torsion-free groups.

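The western analogue of a substructure is an elementary substructure
(TARSKI and VAUGHT [1957]). $\mathfrak{N}$ is an elementary substructure of
$\mathfrak{M}$, $\mathfrak{N} \prec \mathfrak{M}$, if $\mathfrak{N} \subseteq \mathfrak{M}$
and for every formula $\varphi$ and assignment s in $\mathfrak{N}$,

$$\mathfrak{N} \models \varphi[s] \quad\text{iff}\quad \mathfrak{M} \models \varphi[s].$$

We also call $\mathfrak{M}$ an elementary extension of $\mathfrak{N}$ and write
$\mathfrak{M} \succ \mathfrak{N}$. Obviously, $\mathfrak{N} \prec \mathfrak{M}$
implies $\mathfrak{N} \equiv \mathfrak{M}$.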
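It is often more convenient to work with embeddings instead of
substructures. An isomorphic embedding $f : \mathfrak{N} \to \mathfrak{M}$ is a
mapping of N into M such that for every atomic formula $\varphi$ and
assignment s in $\mathfrak{N}$,

$$\mathfrak{N} \models \varphi[s] \quad\text{iff}\quad \mathfrak{M} \models \varphi[s \circ f],$$

where $s \circ f$ denotes the assignment s followed by f. f must be
one-to-one because x = y is atomic. Similarly an elementary embedding
$f : \mathfrak{N} \xrightarrow{\prec} \mathfrak{M}$ is a mapping such that for
every formula $\varphi$ and assignment s in $\mathfrak{N}$,

$$\mathfrak{N} \models \varphi[s] \quad\text{iff}\quad \mathfrak{M} \models \varphi[s \circ f].$$

An isomorphic embedding $f : \mathfrak{N} \to \mathfrak{M}$ is called an
isomorphism if M is the range of f, in symbols $f : \mathfrak{N} \cong \mathfrak{M}$.
This is the usual notion in algebra, and

$$f : \mathfrak{N} \cong \mathfrak{M} \;\Rightarrow\; f : \mathfrak{N} \xrightarrow{\prec} \mathfrak{M} \;\Rightarrow\; \mathfrak{N} \equiv \mathfrak{M}.$$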


Although the notion of an elementary extension is very strong, there is 
an abundance of examples in model theory. The following criterion is 
useful. 


2.10. THEOREM. $\mathfrak{N} \prec \mathfrak{M}$ if and only if
$\mathfrak{N} \subseteq \mathfrak{M}$ and for every formula
$\psi(x, y_1, \ldots, y_n)$ and $a \in M$, $b_1, \ldots, b_n \in N$, if
$\mathfrak{M} \models \psi[a, b_1, \ldots, b_n]$, then there exists $b \in N$
such that $\mathfrak{M} \models \psi[b, b_1, \ldots, b_n]$.

Notice that the condition involves only satisfaction in the larger structure
$\mathfrak{M}$. To prove the theorem, one shows by an easy induction on the
length of $\varphi$ that for every assignment s in $\mathfrak{N}$,

$$\mathfrak{N} \models \varphi[s] \quad\text{iff}\quad \mathfrak{M} \models \varphi[s].$$


Here are some examples using Theorem 2.10. 


2.11. EXAMPLE. Let F be a field and let X, Y be infinite sets of variables
with $X \subseteq Y$. Then $F[X] \prec F[Y]$ where F[X] is the ring of
polynomials in X over F, and $F(X) \prec F(Y)$ where F(X) is the pure
transcendental extension of F by X. If G(X) is the group freely generated by
X, then $G(X) \prec G(Y)$.

It is an open problem of Tarski whether $G(X) \prec G(Y)$, or even
$G(X) \equiv G(Y)$, when X is a finite set of two or more generators.

The notion of a model-complete theory provides additional examples of 
elementary extensions. 


2.12. DEFINITION. A theory T is said to be model-complete if whenever
$\mathfrak{M}$, $\mathfrak{N}$ are models of T and
$\mathfrak{M} \subseteq \mathfrak{N}$, $\mathfrak{M}$ is an elementary
substructure of $\mathfrak{N}$. Equivalently, every isomorphic embedding
between models of T is an elementary embedding.

This notion is due to ROBINSON [1956a]. It is a central idea in eastern
model theory and is the topic of Chapter A.4.


2.13. EXAMPLE. The following theories are model-complete (see Chapters
A.3 and A.4). Algebraically closed fields, real closed ordered fields, 
atomless Boolean algebras, dense linear order. Thus, for example, the 
ordered field of real numbers is an elementary extension of the ordered 
field of real algebraic numbers. 


Each of the above examples has the following property which is even
stronger than model-completeness. Given a formula
$\varphi(x_1, \ldots, x_n)$, the notation

$$T \models \varphi(x_1, \ldots, x_n)$$

means that the sentence $\forall x_1 \cdots \forall x_n\, \varphi$ is a
consequence of T.

2.14. DEFINITION. A theory T admits elimination of quantifiers if every
formula $\varphi(x_1, \ldots, x_n)$ of L is T-equivalent to a quantifier-free
formula $\psi(x_1, \ldots, x_n)$ of L, that is,

$$T \models \varphi(x_1, \ldots, x_n) \leftrightarrow \psi(x_1, \ldots, x_n).$$


It is easy to see that every T which admits elimination of quantifiers is 
model-complete. In fact, for T to be model-complete it is sufficient that 
every formula of L is T-equivalent to a universal formula of L. We shall see 
in the next section that this condition is also necessary. An example of a 
theory which is model-complete but does not admit elimination of quantifi- 
ers is the theory of real closed fields. The order relation can be defined 
from the field operations but a quantifier is needed. 
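As an illustration of the last remark (a standard fact, written in notation that is not taken from the original text), in a real closed field the nonnegative elements are exactly the squares, so the order is definable from the field operations by an existential formula:

```latex
x \le y \;\Longleftrightarrow\; \exists z\, (x + z \cdot z = y).
```

The quantifier $\exists z$ is the one that cannot be removed in the language of fields without an order symbol.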

Eastern model theory concentrates on inductive theories, defined below. 


2.15. DEFINITION. A theory T is said to be inductive if T is equivalent to a
set of $\forall\exists$ sentences, that is, sentences of the form

$$\forall x_1 \cdots \forall x_m\, \exists y_1 \cdots \exists y_n\, \psi$$

where $\psi$ is quantifier-free.

Inductive theories are also called $\forall\exists$ theories. Most theories
arising in algebra are inductive. All the theories listed in Examples 2.3
and 2.4 are inductive except finite fields, Bernays-Gödel set theory,
Zermelo-Fraenkel set theory, and Peano arithmetic. The next result is the
reason for the use of the name "inductive".


2.16. PROPOSITION. If T is inductive, then the union of any increasing chain

$$\mathfrak{M}_0 \subseteq \mathfrak{M}_1 \subseteq \mathfrak{M}_2 \subseteq \cdots$$

of models of T is a model of T.


The proof of Proposition 2.16 is easy. The converse is deeper and will be
proved in the next section.

The western analogue of a chain of structures is an elementary chain

$$\mathfrak{M}_0 \prec \mathfrak{M}_1 \prec \mathfrak{M}_2 \prec \cdots$$

Elementary chains are an extremely important construction in model
theory, and several applications are given later in this chapter. Their
importance is based on the following fundamental result of TARSKI and
VAUGHT [1957].

2.17. ELEMENTARY CHAIN THEOREM. The union $\mathfrak{M}$ of an elementary chain

$$\mathfrak{M}_0 \prec \mathfrak{M}_1 \prec \mathfrak{M}_2 \prec \cdots$$

is an elementary extension of each structure $\mathfrak{M}_n$ in the chain.

PROOF. Show by induction on the length of formulas $\varphi$ that for each n
and each assignment s in $\mathfrak{M}_n$,

$$\mathfrak{M}_n \models \varphi[s] \quad\text{iff}\quad \mathfrak{M} \models \varphi[s]. \qquad \Box$$


It follows that for any theory T, the union of an elementary chain of 
models of T is a model of T. Proposition 2.16 and Theorem 2.17 hold not 
only for countable chains of structures, but also for uncountable chains and 
even for arbitrary upward directed systems of structures. 

There are two ways to reduce western model theory to eastern model 
theory by adding extra symbols. By a conservative extension of a theory T
in L we mean a theory $T' \supseteq T$ in an expanded language
$L' \supseteq L$ such that every model $\mathfrak{M}$ of T has an expansion to a
model $\mathfrak{M}'$ of T'.


2.18. THEOREM. Every theory T has a conservative extension T' which is 
inductive and admits elimination of quantifiers. 


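PROOF. Write x for $x_1, \ldots, x_n$. For each formula $\varphi(x)$ of L, add
a new relation symbol $R_\varphi(x)$ to L', called a Skolem relation for
$\varphi$. T' is the union of T and

$$\{\forall x\, (\varphi(x) \leftrightarrow R_\varphi(x)) : \varphi \in L\}.$$

T' is obviously a conservative extension of T. By eliminating Skolem
relations, every formula $\varphi'(x)$ of L' is T'-equivalent to a formula
$\varphi(x)$ of L and hence to $R_\varphi(x)$. Therefore T' admits elimination
of quantifiers. The equivalent $\forall\exists$ axioms for T' are the
sentences $R_\varphi$, $\varphi \in T$, and the sentences of the form

$$\forall x\, (R_{\neg\varphi}(x) \leftrightarrow \neg R_\varphi(x)),$$
$$\forall x\, (R_{\varphi \wedge \psi}(x) \leftrightarrow R_\varphi(x) \wedge R_\psi(x)),$$
$$\forall x\, (R_{\exists y \varphi}(x) \leftrightarrow \exists y\, R_\varphi(x, y)). \qquad \Box$$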


2.19. THEOREM. Every theory T has a conservative extension T' which is 
universal and model-complete.

PROOF. For each formula of the form $\exists y\, \varphi(x, y)$ in L add a new
function symbol $F_{\exists y \varphi}(x)$, called a Skolem function; call the
new language $L_1$. Add to T the new axioms

$$\forall x\, (\exists y\, \varphi(x, y) \rightarrow \varphi(x, F_{\exists y \varphi}(x))),$$

forming the theory $T_1$. Iterate the process countably many times, and let
$L' = \bigcup_n L_n$, $T' = \bigcup_n T_n$. Then T' is a conservative extension
of T. Using the axioms one can replace existential quantifiers by Skolem
functions, so that every formula $\varphi$ of L' is T'-equivalent to a
universal formula $\varphi'$, hence T' is model-complete. The set of universal
sentences equivalent to T' contains $\varphi'$ where $\varphi \in T$, and the
sentences

$$\forall x\, \forall y\, (\psi(x, y) \rightarrow \psi(x, F_{\exists y \psi}(x)))$$

where $\psi$ is quantifier-free in L'. $\Box$


The theory T' in 2.18 is sometimes called the Morley expansion of T,
while the T' in 2.19 is called the (iterated) Skolem expansion of T. The
Morley expansion required only a single step because every sentence of L'
is T'-equivalent to a sentence of L. On the other hand in the Skolem
expansion we kept getting new formulas in $L_{n+1}$, and needed to iterate the
construction countably many times. Skolem expansions play a key role in
Chapter A.5.
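The replacement of existential quantifiers by function symbols can be illustrated on a prenex quantifier prefix. Note that this sketch uses the familiar prenex form of Skolemization, with one function per existential variable, rather than the formula-indexed Skolem functions of the proof of 2.19; the name `skolemize` and the term syntax `F_y(x)` are hypothetical.

```python
def skolemize(prefix):
    """Eliminate existential quantifiers from a prenex quantifier prefix.

    `prefix` is a list of (quantifier, variable) pairs, outermost first, with
    quantifier in {"forall", "exists"}. Returns the remaining universal
    variables and a substitution sending each existential variable to a
    Skolem term in the universal variables that precede it.
    """
    universals, substitution = [], {}
    for quantifier, variable in prefix:
        if quantifier == "forall":
            universals.append(variable)
        elif quantifier == "exists":
            args = ", ".join(universals)
            # F_<variable> is a hypothetical name for the new function symbol;
            # with no preceding universal variables it is a new constant.
            substitution[variable] = (
                f"F_{variable}({args})" if universals else f"F_{variable}"
            )
        else:
            raise ValueError(f"unknown quantifier: {quantifier}")
    return universals, substitution

# forall x exists y (x + y = 0)  becomes  forall x (x + F_y(x) = 0)
print(skolemize([("forall", "x"), ("exists", "y")]))
```

For the group axiom asserting inverses, the Skolem function `F_y` plays the role of the inverse operation, which is why group theory becomes universal once a symbol for inverses is added to the language.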


3. Diagrams and compactness 


The method of diagrams was invented by Henkin and A. Robinson
around 1950. They used the method to give new proofs of the Gödel
completeness theorem and obtained several other applications. A diagram
of a structure $\mathfrak{M}$ is analogous to the multiplication table of a
group; it is a set of sentences involving new constant symbols for the
elements of $\mathfrak{M}$. Diagrams are useful because they give a way of
constructing models from sets of sentences. There are two kinds of
diagrams, corresponding to eastern model theory and western model theory.


3.1. DEFINITION. Let $\mathfrak{M}$ be a structure for a first order language L.
The diagram language of $\mathfrak{M}$ is the expansion $L_M$ of L formed by
adding a new constant symbol $c_m$ for each element $m \in M$. The diagram
expansion of $\mathfrak{M}$ is the structure $\mathfrak{M}_M$ for $L_M$ such
that each $c_m$ is interpreted by m. The (basic)
diagram of $\mathfrak{M}$ is the set $D(\mathfrak{M})$ of all atomic and negated atomic sentences
of $L_M$ which are true in $\mathfrak{M}_M$. The elementary diagram of
$\mathfrak{M}$ is the complete theory $\mathrm{Th}(\mathfrak{M}_M)$, that is, the
set of all sentences of $L_M$ true in $\mathfrak{M}_M$.
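For a finite structure the basic diagram is a finite list that can be written out mechanically, much like a multiplication table. The sketch below assumes a language with a single binary relation symbol R and no function symbols; the function name `basic_diagram` and the string syntax for sentences are illustrative assumptions.

```python
from itertools import product

def basic_diagram(universe, relation):
    """Atomic and negated atomic sentences true in (universe, R), R binary.

    Constants are written c_m. Equalities between constants are included
    because x = y is atomic. The string syntax is an assumption of this sketch.
    """
    sentences = []
    for a, b in product(universe, repeat=2):
        if (a, b) in relation:
            sentences.append(f"R(c_{a}, c_{b})")
        else:
            sentences.append(f"~R(c_{a}, c_{b})")
        if a == b:
            sentences.append(f"c_{a} = c_{b}")
        else:
            sentences.append(f"~(c_{a} = c_{b})")
    return sentences

# The two-element linear order 0 < 1.
for sentence in basic_diagram([0, 1], {(0, 1)}):
    print(sentence)
```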


More generally, given a subset $X \subseteq M$, the expansion $L_X$ and
structure $\mathfrak{M}_X$ are defined in the natural way. If f maps X into M,
then $\mathfrak{M}_{fX}$ is the structure for $L_X$ in which each $c_x$ is
interpreted by f(x).

There is a simple but useful relationship between diagrams and embed- 
dings. 


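3.2. PROPOSITION. (i) f is an isomorphic embedding of $\mathfrak{M}$ into
$\mathfrak{N}$ if and only if the expansion $\mathfrak{N}_{fM}$ is a model of
the diagram of $\mathfrak{M}$. Briefly,

$$f : \mathfrak{M} \to \mathfrak{N} \quad\text{iff}\quad \mathfrak{N}_{fM} \models D(\mathfrak{M}).$$

(ii) f is an elementary embedding of $\mathfrak{M}$ into $\mathfrak{N}$ if and
only if $\mathfrak{N}_{fM}$ is a model of the elementary diagram of
$\mathfrak{M}$,

$$f : \mathfrak{M} \xrightarrow{\prec} \mathfrak{N} \quad\text{iff}\quad \mathfrak{N}_{fM} \models \mathrm{Th}(\mathfrak{M}_M).$$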
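For finite structures, part (i) reduces to a direct check that f preserves and reflects every atomic fact. A minimal sketch, assuming structures with a single binary relation given as a set of pairs; the function name `is_isomorphic_embedding` is hypothetical.

```python
from itertools import product

def is_isomorphic_embedding(f, domain, R_dom, R_cod):
    """Check that f (a dict on `domain`) preserves and reflects R and equality.

    Structures are assumed to have a single binary relation, given as a set
    of pairs; this covers the atomic formulas R(x, y) and x = y.
    """
    for a, b in product(domain, repeat=2):
        if ((a, b) in R_dom) != ((f[a], f[b]) in R_cod):
            return False
        if (a == b) != (f[a] == f[b]):
            return False
    return True

# Embedding the order 0 < 1 into the order 0 < 1 < 2.
small = {(0, 1)}
large = {(0, 1), (0, 2), (1, 2)}
print(is_isomorphic_embedding({0: 0, 1: 2}, [0, 1], small, large))  # True
print(is_isomorphic_embedding({0: 2, 1: 0}, [0, 1], small, large))  # False
```

The equality check is what forces an embedding to be one-to-one.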


It follows that T is model-complete if and only if for every model
$\mathfrak{M}$ of T, $T \cup D(\mathfrak{M})$ is complete. This is the
formulation which suggested the name model-complete to Robinson.

In model theory, the most important consequence of the completeness 
theorem is the compactness theorem. The completeness and compactness 
theorems are proved in a leisurely manner in the chapter by Barwise. To
illustrate the use of diagrams we shall give a second direct proof of the
compactness theorem here. Still another proof, using ultraproducts, is 
given in Chapter A.3. Our proof is based on the lemma below. 

A theory T is said to be finitely satisfiable if every finite subset of T has a 
model. 


3.3. LEMMA (HENKIN [1949]). Let M be a nonempty set, L a first order
language, and T a theory in the diagram language $L_M$. T is the elementary
diagram of some structure $\mathfrak{M}$ with universe M if and only if:
(i) T is finitely satisfiable.
(ii) For each sentence $\varphi$ of $L_M$, either $\varphi \in T$ or $(\neg\varphi) \in T$.
(iii) If $\exists x\, \varphi(x) \in T$, then $\varphi(c_m) \in T$ for some $m \in M$.
(iv) If $m, n \in M$ and $m \neq n$, then $(\neg\, c_m = c_n) \in T$.

PROOF. It is obvious that every elementary diagram of a model with
universe M has properties (i)-(iv). If T has properties (i)-(iv), there is a
unique model $\mathfrak{M}$ whose basic diagram is included in T. Then it
follows by induction on the length of $\varphi$ that each sentence $\varphi$ of
$L_M$ holds in $\mathfrak{M}_M$ if and only if it belongs to T. $\Box$


3.4. COMPACTNESS THEOREM. Let $T_0$ be a set of sentences of L. If $T_0$ is
finitely satisfiable, then $T_0$ has a model.

PROOF. Let L have $\kappa$ sentences and let C be a set of $\kappa$ new
constant symbols. $\kappa$ must be infinite, so the expanded language $L_C$
also has $\kappa$ sentences, say $\varphi_\alpha$, $\alpha$ an ordinal
$< \kappa$. We may form an increasing chain $T_\alpha$, $\alpha < \kappa$, of
sets of sentences of $L_C$ such that:

(1) Fewer than $\kappa$ constant symbols from C occur in $T_\alpha$.

(2) $T_\alpha$ is finitely satisfiable.

(3) If $T_\alpha \cup \{\varphi_\alpha\}$ is finitely satisfiable, then $\varphi_\alpha \in T_{\alpha+1}$.

(4) If $\varphi_\alpha \in T_{\alpha+1}$ and $\varphi_\alpha$ is of the form
$\exists x\, \psi(x)$, then $\psi(c) \in T_{\alpha+1}$ for some $c \in C$.

Let $T_\kappa = \bigcup_{\alpha < \kappa} T_\alpha$. $T_\kappa$ has properties
(i)-(iii) of Lemma 3.3. The relation

$$\{(c, d) : (c = d) \in T_\kappa\}$$

is an equivalence relation on the set C. Choose a set $M \subseteq C$
containing exactly one element of each equivalence class and let T be the
set of all sentences of $L_M$ belonging to $T_\kappa$. Then T has properties
(i)-(iv) of the lemma, so T is the elementary diagram of some model
$\mathfrak{M}$. Since $T_0 \subseteq T_\kappa$, we have $T_0 \subseteq T$, hence
$\mathfrak{M}$ is a model of $T_0$. $\Box$


The compactness theorem is sometimes called the local theorem or the 
finiteness theorem. It is more often called the compactness theorem 
because it says that a certain topological space is compact. Topological 
notation is often found to be convenient in model theory, especially in the 
area known as stability theory (Section 7). 

Given a first order language L, let S be the set of all complete theories
$\mathrm{Th}(\mathfrak{M})$ in L. We call the elements
$p = \mathrm{Th}(\mathfrak{M}) \in S$ points of S. The Stone space of L is the
set S with the following topology. For each sentence $\varphi$ of L, let

$$[\varphi] = \{p \in S : \varphi \in p\},$$

and give S the topology whose basic closed sets are the sets of the form
$[\varphi]$. These sets form a closed basis because

$$[\varphi \vee \psi] = [\varphi] \cup [\psi].$$

In fact, the sets $[\varphi]$ are also open (hence clopen), because

$$[\neg\varphi] = S - [\varphi].$$


The closed sets of S are exactly the intersections of basic closed sets, and
hence are the sets of the form

$$S(T) = \bigcap_{\varphi \in T} [\varphi] = \{p \in S : T \subseteq p\}$$

where T is a theory in L. S is a totally disconnected Hausdorff space,
because if $p \neq q$ then there is a sentence $\varphi \in p - q$, hence

$$p \in [\varphi], \quad q \notin [\varphi], \quad [\varphi] \text{ clopen}.$$

A theory T is satisfiable just in case

$$\bigcap_{\varphi \in T} [\varphi] \neq \emptyset.$$


Thus the compactness theorem states that if every finite subset of a set of
basic closed sets $\{[\varphi] : \varphi \in T\}$ has a nonempty intersection,
then the whole set has a nonempty intersection. In other words, the
compactness theorem states that the Stone space S of L is compact.

We shall now give several applications of the compactness theorem. 
More applications to algebra and analysis can be found in Chapters A.1, 
A.3, A.4 and A.6. 


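3.5. THEOREM. The amalgamation property holds for elementary embeddings.
That is, if $f : \mathfrak{M} \xrightarrow{\prec} \mathfrak{N}$ and
$g : \mathfrak{M} \xrightarrow{\prec} \mathfrak{P}$, then there is a structure
$\mathfrak{Q}$ and a commutative diagram of elementary embeddings

$$\begin{array}{ccc}
\mathfrak{N} & \xrightarrow{\;h\;} & \mathfrak{Q} \\
{\scriptstyle f}\,\big\uparrow & & \big\uparrow\,{\scriptstyle k} \\
\mathfrak{M} & \xrightarrow{\;g\;} & \mathfrak{P}
\end{array}$$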

PROOF. Let $T_N$ and $T_P$ be the elementary diagrams of $\mathfrak{N}$ and
$\mathfrak{P}$ with distinct constant symbols $c_n$, $n \in N$ and $d_p$,
$p \in P$. Let T be the theory

$$T = T_N \cup T_P \cup \{c_{f(m)} = d_{g(m)} : m \in M\}$$

in the language $(L_N)_P$. Any finite subset of T is satisfied by an
expansion of $\mathfrak{M}$ to $(L_N)_P$ in which the constants $c_{f(m)}$,
$d_{g(m)}$, $m \in M$, are interpreted by m. Therefore by compactness, T has a
model $\mathfrak{Q}'$. Let $\mathfrak{Q}$ be the reduct of $\mathfrak{Q}'$ to L,
and let h(n), k(p) be the interpretations of $c_n$, $d_p$ in $\mathfrak{Q}'$.
Then $h : \mathfrak{N} \to \mathfrak{Q}$, $k : \mathfrak{P} \to \mathfrak{Q}$,
and the diagram commutes. By 3.2, h and k are elementary embeddings. $\Box$

The amalgamation property for isomorphic embeddings is important in
eastern model theory.


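3.6. DEFINITION. A class K of structures for L has the amalgamation
property if whenever $\mathfrak{M}, \mathfrak{N}, \mathfrak{P} \in K$ and
$f : \mathfrak{M} \to \mathfrak{N}$, $g : \mathfrak{M} \to \mathfrak{P}$, there
is a structure $\mathfrak{Q} \in K$ and a commutative diagram of isomorphic
embeddings

$$\begin{array}{ccc}
\mathfrak{N} & \longrightarrow & \mathfrak{Q} \\
{\scriptstyle f}\,\big\uparrow & & \big\uparrow \\
\mathfrak{M} & \xrightarrow{\;g\;} & \mathfrak{P}
\end{array}$$

The following result has a proof like that of Theorem 3.5.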


3.7. THEOREM. If T is a theory which admits elimination of quantifiers, then
the class K of all substructures of models of T has the amalgamation
property.


We shall see later (3.11) that the class K is itself the class of models of a 
theory. When T and K are as in 3.7, we call T the model completion of K. 
Table 1 shows a short list of examples of elementary classes K with the 
amalgamation property. These matters are discussed further in Chapter 
A.4, where some exciting newer examples are discussed. 


TABLE 1

K                   T = Model completion of K
Fields              Algebraically closed fields
Ordered fields      Real closed ordered fields
Boolean algebras    Atomless Boolean algebras
Abelian groups      Algebraically closed abelian groups
Groups              Does not exist


The next application was given by Malcev (see MALCEV [1971]) at a very
early stage and was instrumental in the development of model theory. A 
more general treatment using ultraproducts is in Chapter A.3. 


3.8. DEFINITION. A class K of structures for L is said to have finite
character if for every $\mathfrak{M}$, $\mathfrak{M} \in K$ if and only if every
finitely generated $\mathfrak{N} \subseteq \mathfrak{M}$ belongs to K.

3.9. THEOREM. Suppose K is the class of all substructures of L-reducts of
models of a theory T' in a language $L' \supseteq L$. Then K has finite
character.

PROOF. $\mathfrak{M} \in K$ if and only if $T' \cup D(\mathfrak{M})$ has a
model. If every finitely generated $\mathfrak{N} \subseteq \mathfrak{M}$ belongs
to K, then $T' \cup D(\mathfrak{M})$ is finitely satisfiable, and by
compactness $T' \cup D(\mathfrak{M})$ has a model. $\Box$


Malcev gave several applications of the above theorem to group theory. 
Here is a sample. 


3.10. EXAMPLE. The following classes of groups have finite character by
Theorem 3.9. 

(1) The class of groups with a solvable normal series of length <k (k a 
fixed positive integer). 

(2) The class of groups with a normal subgroup of index <k. 

(3) The class of orderable groups, i.e. reducts of ordered groups. 

(4) The class of all groups $\mathfrak{M}$ such that for some field
$\mathfrak{F}$, $\mathfrak{M}$ is representable as a group of $n \times n$
matrices over $\mathfrak{F}$.


The next theorem is the simplest example of a collection of results in
model theory called preservation theorems. T is preserved under
substructures if whenever $\mathfrak{M} \models T$, every substructure
$\mathfrak{N} \subseteq \mathfrak{M}$ is a model of T.

3.11. ŁOŚ-TARSKI THEOREM. A theory T is preserved under substructures if
and only if T is a universal theory.

PROOF. We shall prove a bit more. Let K be the class of all substructures of
models of T and let $T_\forall$ be the set of all universal consequences of T.
We prove that K is the class of all models of $T_\forall$. Obviously
$\mathfrak{M} \in K$ implies $\mathfrak{M} \models T_\forall$. Let
$\mathfrak{M}$ be a model of $T_\forall$. It follows that
$D(\mathfrak{M}) \cup T$ is finitely satisfiable. By compactness
$D(\mathfrak{M}) \cup T$ has a model $\mathfrak{N}$. $\mathfrak{M}$ is
isomorphically embeddable in $\mathfrak{N}$ by 3.2, so $\mathfrak{M} \in K$. $\Box$


By a similar argument one can prove the quantifier elimination
characterization of model completeness (ROBINSON [1956a]).

3.12. THEOREM. A theory T is model-complete if and only if every formula
$\varphi(x_1, \ldots, x_n)$ is T-equivalent to a universal formula
$\psi(x_1, \ldots, x_n)$.


We conclude with another preservation theorem.

3.13. CHANG-ŁOŚ-SUSZKO THEOREM. A theory T is preserved under unions
of chains if and only if T is inductive, that is, equivalent to a set of
$\forall\exists$ sentences.

PROOF. For the nontrivial direction, suppose T is preserved under unions
of chains. Let $T_{\forall\exists}$ be the set of $\forall\exists$ consequences
of T. Consider an arbitrary model $\mathfrak{M}$ of $T_{\forall\exists}$. It
follows that $D_\forall(\mathfrak{M}) \cup T$ is consistent, where
$D_\forall(\mathfrak{M})$ is the set of all universal sentences of $L_M$ true
in $\mathfrak{M}_M$. By compactness, $D_\forall(\mathfrak{M}) \cup T$ has a model
$\mathfrak{N}_M$ with $\mathfrak{M} \subseteq \mathfrak{N}$. Then every
existential sentence of $L_M$ true in $\mathfrak{N}_M$ holds in
$\mathfrak{M}_M$. Hence the diagram $D(\mathfrak{N})$ of $\mathfrak{N}$ is
consistent with $\mathrm{Th}(\mathfrak{M}_M)$. By compactness again,
$\mathfrak{N}$ has an extension $\mathfrak{M}_1$ which is an elementary
extension of $\mathfrak{M}$. Repeating the construction countably many times
we get a chain

$$\mathfrak{M} \subseteq \mathfrak{N} \subseteq \mathfrak{M}_1 \subseteq \mathfrak{N}_1 \subseteq \cdots$$

where each $\mathfrak{N}_i$ is a model of T and the $\mathfrak{M}_i$'s form an
elementary chain. The union of this chain is both a model of T and an
elementary extension of $\mathfrak{M}$. Thus $\mathfrak{M}$ is a model of T,
and T is equivalent to $T_{\forall\exists}$. $\Box$


The method used in the above proof occurs frequently in model theory 
and is called the method of alternating chains. 


3.14. COROLLARY. Every model-complete theory is inductive, i.e. is
equivalent to a set of $\forall\exists$ sentences.

PROOF. By the elementary chain theorem and 3.13. $\Box$


4. Löwenheim-Skolem Theorems


In Chapter A.1 the Löwenheim-Skolem Theorem was proved in the
following form.

4.1. LÖWENHEIM-SKOLEM THEOREM. If T has an infinite model, then T has
models of every infinite cardinal $\kappa$ greater than or equal to the
cardinal of T.


We shall prove two important stronger forms of this theorem due to
TARSKI and VAUGHT [1957]. In this section we always let $\kappa$ be an
infinite cardinal greater than or equal to the number of symbols in L. By
the cardinal of $\mathfrak{M}$ we mean the cardinal |M| of the universe of
$\mathfrak{M}$.

4.2. DOWNWARD LÖWENHEIM-SKOLEM THEOREM. Suppose $X \subseteq M$ and
$|X| \le \kappa \le |M|$. Then $\mathfrak{M}$ has an elementary submodel
$\mathfrak{N} \prec \mathfrak{M}$ of cardinal $\kappa$ such that
$X \subseteq N$.

PROOF. The proof is a simple construction and does not use the
compactness theorem. Form a countable increasing chain
$X_0 \subseteq X_1 \subseteq X_2 \subseteq \cdots$ of subsets of M such that

(1) $X \subseteq X_0$;

(2) $X_n$ has cardinal $\kappa$;

(3) for each sentence $\exists x\, \varphi(x)$ of $L_{X_n}$ which holds in
$\mathfrak{M}_{X_n}$, there is an element of $X_{n+1}$ which satisfies
$\varphi(x)$.

Then the substructure $\mathfrak{N}$ of $\mathfrak{M}$ with universe
$N = \bigcup_n X_n$ has the desired properties. $\Box$


The Downward Löwenheim-Skolem Theorem has some applications
which go beyond first order logic.

4.3. EXAMPLE. If a theory T in the language of ordered field theory has an
Archimedean model, then it has a countable Archimedean model.

This is because every substructure of an Archimedean field is again
Archimedean.

4.4. EXAMPLE. An $\in$-model of set theory is a model of the form
$(M, \in)$. Let T be a theory in the language $L = \{\in\}$. If T has an
infinite $\in$-model, it has a countable $\in$-model.

4.5. EXAMPLE. Let $\mathfrak{M} = (M, <, \ldots)$ be a structure such that
(M, <) has order type $\omega_1$, and the language L is countable. Then
$\mathfrak{M}$ has a countable elementary submodel $\mathfrak{N}$ such that
(N, <) is an initial segment of (M, <).

PROOF. Using 4.2 countably many times we obtain an elementary chain
$\mathfrak{N}_0 \prec \mathfrak{N}_1 \prec \cdots$ of countable elementary
submodels of $\mathfrak{M}$ such that $\mathfrak{N}_{n+1}$ contains the initial
segment of M generated by $\mathfrak{N}_n$. The desired structure is the union
$\mathfrak{N} = \bigcup_n \mathfrak{N}_n$. $\Box$


Here is a finite analogue of the Löwenheim-Skolem Theorem.

4.6. PROPOSITION. If a theory T has arbitrarily large finite models, then T
has an infinite model.

PROOF. The theory

$$T' = T \cup \{\forall x_1 \cdots \forall x_n\, \exists y\, (y \neq x_1 \wedge \cdots \wedge y \neq x_n) : n = 1, 2, \ldots\}$$

is finitely satisfiable and hence has a model. Obviously every model of T' is
infinite. $\Box$
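The n-th sentence added in this proof says that the universe has more than n elements, which is why any finite part of T' is satisfied in a sufficiently large finite model of T. A brute-force check on finite universes, given as a sketch with the hypothetical name `more_than_n_elements`:

```python
from itertools import product

def more_than_n_elements(universe, n):
    """Evaluate  forall x_1 ... forall x_n exists y (y != x_1 & ... & y != x_n)."""
    universe = list(universe)
    return all(
        any(all(y != x for x in xs) for y in universe)
        for xs in product(universe, repeat=n)
    )

print(more_than_n_elements(range(3), 2))  # True: 3 elements, n = 2
print(more_than_n_elements(range(3), 3))  # False: 3 elements, n = 3
```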


We now come to another result of Tarski and Vaught, which is proved by 
the compactness theorem. 


4.7. UPWARD LÖWENHEIM-SKOLEM THEOREM. Let $\mathfrak{M}$ be an infinite
structure for L. For every cardinal $\kappa$ greater than or equal to the
cardinal of $\mathfrak{M}$ and the cardinal of L, $\mathfrak{M}$ has an
elementary extension of cardinal $\kappa$.

PROOF. Let C be a set of $\kappa$ new constant symbols, and form the theory T
in $(L_M)_C$ given by

$$T = \mathrm{Th}(\mathfrak{M}_M) \cup \{\neg\, c = d : c, d \in C,\ c \neq d\}.$$

T is finitely satisfiable and hence has a model $\mathfrak{N}'$. We may assume
without loss of generality that for each $m \in M$, $c_m$ is interpreted by m
in $\mathfrak{N}'$. Then by 3.2, the reduct $\mathfrak{N}$ of $\mathfrak{N}'$ to
L is an elementary extension of $\mathfrak{M}$. Moreover, $\mathfrak{N}$ has
cardinal at least $\kappa$. By the Downward Löwenheim-Skolem Theorem there is
a $\mathfrak{P}$ of cardinal $\kappa$ such that
$\mathfrak{M} \prec \mathfrak{P} \prec \mathfrak{N}$. $\Box$


4.8. REMARK. The same method also shows that every infinite structure
of cardinal $\kappa$ has a proper elementary extension of cardinal $\kappa$.


The theorem is sometimes applied to an expansion of the original 
language L. 


4.9. EXAMPLE. A homogeneous linear ordering is a linear ordering (M, <)
such that any order-preserving function from a finite subset of M into M
can be extended to an automorphism of (M, <). For every infinite $\kappa$
there is a homogeneous dense linear ordering of cardinal $\kappa$.

PROOF. The ordering of the rationals, $(\mathbb{Q}, <)$, is homogeneous. Thus
for each n there is a (2n + 1)-placed function $f_n(x, y, z)$ such that if y
and z are increasing n-tuples then $f_n(\cdot, y, z)$ is an automorphism of
$(\mathbb{Q}, <)$ mapping y onto z. If $(M, <, \ldots)$ is an elementary
extension of $(\mathbb{Q}, <, f_0, f_1, \ldots)$ of cardinal $\kappa$, then
(M, <) is homogeneous. Of course, one can construct examples directly but
they are rather complicated. $\Box$
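One concrete choice of the functions $f_n$ is piecewise-linear interpolation between the points of y, with translations outside them. This is only one possible choice, given as a sketch; the name `automorphism` and the use of `Fraction` for rationals are assumptions, not taken from the text.

```python
from fractions import Fraction

def automorphism(x, y, z):
    """Piecewise-linear automorphism of (Q, <) sending y[i] to z[i].

    y and z are strictly increasing tuples of rationals of equal length.
    """
    x = Fraction(x)
    y = [Fraction(v) for v in y]
    z = [Fraction(v) for v in z]
    if not y:
        return x                      # n = 0: the identity
    if x <= y[0]:
        return x + (z[0] - y[0])      # translate to the left of y[0]
    if x >= y[-1]:
        return x + (z[-1] - y[-1])    # translate to the right of y[-1]
    for i in range(len(y) - 1):
        if y[i] <= x <= y[i + 1]:
            slope = (z[i + 1] - z[i]) / (y[i + 1] - y[i])
            return z[i] + slope * (x - y[i])

print(automorphism(Fraction(1, 2), (0, 1), (5, 7)))  # 6
```

Each piece is an increasing linear map with rational coefficients, so the function is an order-preserving bijection of the rationals.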
The Löwenheim-Skolem Theorem is often used in proofs that theories
are complete. The simplest completeness test is the Łoś-Vaught test.


4.10. DEFINITION. A theory T is said to be $\kappa$-categorical if, up to
isomorphism, T has exactly one model of cardinal $\kappa$.

4.11. ŁOŚ-VAUGHT TEST. If T has no finite models and T is
$\kappa$-categorical for some $\kappa$, then T is complete.

PROOF. If T has two distinct complete extensions, then T must have two
nonisomorphic models of cardinal $\kappa$. $\Box$


4.12. EXAMPLE. Examples of $\omega$-categorical theories are:
(i) Dense linear order without endpoints. 
(ii) Equivalence relations with infinitely many classes and all classes 
infinite. 
(iii) Atomless Boolean algebras. 
(iv) Abelian groups with all elements of order p. 


4.13. EXAMPLE. Examples of $\omega_1$-categorical theories are:
(i) Algebraically closed fields of given characteristic. 
(ii) Uniquely divisible torsion-free abelian groups. 
(iii) The theory of structures (M, f) where f is a permutation of M with 
no finite cycles. 
(iv) Abelian groups with all elements of order p. 


Each of the above examples is complete (by the Łoś-Vaught test) and
recursively axiomatizable, and hence decidable. Theories which are
$\kappa$-categorical are very rare. We shall meet them again in Sections 7
and 8. The following more general completeness test can be used more often
than the Łoś-Vaught test.


4.14. COMPLETENESS TEST. A countable theory T is complete if and only if T
is consistent and any two finite or countable models of T have isomorphic
elementary extensions.

PROOF. The necessity follows by compactness and the sufficiency follows
from the Downward Löwenheim-Skolem Theorem. $\Box$


4.15. EXAMPLE. Let T be the theory of structures (M, f) where
$f : M \to M$ is a permutation of M. The complete extensions of T are exactly
the theories T(s) described as follows. For each countable sequence
$s = (s_1, s_2, s_3, \ldots)$ of elements of $\omega \cup \{\infty\}$, let T(s)
be the theory of all models of T such that f has $s_n$ cycles of length n,
interpreting $\infty$ as infinitely many.

Case 1. $\sum s_n$ is finite. Then T(s) has a unique model which is finite.

Case 2. $\sum s_n = \infty$ but only finitely many $s_n$ are nonzero. Then T(s)
has no finite models and is $\omega$-categorical, hence complete.

Case 3. Infinitely many $s_n$ are nonzero. Let $\mathfrak{M}$ be a countable
model of T(s). By compactness, $\mathfrak{M}$ has a countable elementary
extension $\mathfrak{N}$ with infinitely many infinite cycles. Up to
isomorphism, T(s) has only one such model $\mathfrak{N}$, so T(s) is complete.
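For a finite permutation, the invariants $s_n$ of this example can be read off directly. A minimal sketch, with the hypothetical name `cycle_type` and the permutation encoded as a dict; it only covers the finite case, where every cycle has finite length.

```python
from collections import Counter

def cycle_type(perm):
    """Return {n: s_n}, the number of cycles of each length n of a finite permutation.

    `perm` is a dict mapping each element of M to its image under f.
    """
    seen, counts = set(), Counter()
    for start in perm:
        if start in seen:
            continue
        length, x = 0, start
        # Follow the cycle through `start` until it closes up.
        while x not in seen:
            seen.add(x)
            x = perm[x]
            length += 1
        counts[length] += 1
    return dict(counts)

# One 2-cycle (0 1), one fixed point (2), one 3-cycle (3 4 5).
print(cycle_type({0: 1, 1: 0, 2: 2, 3: 4, 4: 5, 5: 3}))
```

Two finite models of T are isomorphic exactly when they have the same cycle type, which is the finite case (Case 1) of the classification.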


4.16. EXAMPLE. Let T be the theory of linear orderings in which each
element has a unique successor and predecessor. The integers with order
form a model of T. This theory T is complete.

We show T is complete using the method of alternating chains. Let
$\mathfrak{M}$ and $\mathfrak{N}$ be two countable models of T. By
compactness, there is an embedding $f : \mathfrak{M} \to \mathfrak{N}_1$, where
$\mathfrak{N}_1$ is countable, $\mathfrak{N} \prec \mathfrak{N}_1$, and f
preserves immediate successors and predecessors. Using compactness again,
there is a $g : \mathfrak{N}_1 \to \mathfrak{M}_1$, where $\mathfrak{M}_1$ is
countable, $\mathfrak{M} \prec \mathfrak{M}_1$, g preserves immediate
successors and predecessors, and $f \circ g$ (first f, then g) is the
identity on M. Repeating the construction back and forth countably many
times we obtain two elementary chains
$\mathfrak{M} = \mathfrak{M}_0 \prec \mathfrak{M}_1 \prec \cdots$ and
$\mathfrak{N} = \mathfrak{N}_0 \prec \mathfrak{N}_1 \prec \cdots$ such that the
unions of the chains are isomorphic. Thus by 4.14, T is complete. It follows
that T is the complete theory of the integers with order, and T is decidable.

Several important theories can be shown to be complete and decidable 
by this method. For example, the theory of the integers under addition 
(Presburger), and the theory of real closed ordered fields (Tarski). The
method can also be used to analyse all complete extensions of incomplete 
theories, for example the theory of Boolean algebras (Tarski). 

There are several ways of generalizing the Löwenheim-Skolem problem
by considering properties other than the cardinality of the universe of a 
model. The first such problem, proposed by Vaught, concerns pairs of 
cardinals. Suppose one of the relation symbols of L is a unary relation U. 
By a (κ, λ)-model we mean a model 𝔐 whose universe has cardinal κ and
whose interpretation of U has cardinal λ. We shall state some
Löwenheim-Skolem Theorems without proof. Some of the proofs are very
deep and use infinitary combinatorics. Assume L is countable in each case.


4.17. THEOREM. If T has a (κ, λ)-model and κ ≥ κ' ≥ λ, then T has a
(κ', λ)-model. (This is a corollary of the Downward Löwenheim-Skolem
Theorem.)


4.18. THEOREM (CHANG and KEISLER [1973]). If T has a (κ, λ)-model, then
T has a (κ', λ')-model whenever κ ≥ κ' ≥ λ' ≥ λ^ω.


4.19. THEOREM (VAUGHT [1965]). If T has a (κ, λ)-model where λ < κ,
then T has an (ω_1, ω)-model.


4.20. THEOREM (CHANG [1965], GCH). If T has an (ω_1, ω)-model and ω_α is
a regular cardinal, then T has an (ω_{α+2}, ω_{α+1})-model.


4.21. THEOREM (JENSEN [1972], V = L). If n is finite and T has an
(ω_{α+n}, ω_α)-model, then T has an (ω_{β+n}, ω_β)-model.


4.22. THEOREM (VAUGHT [1965]). Assume λ < κ, 2^λ < κ, 2^(2^λ) < κ, .... If T
has a (κ, λ)-model, then T has a (κ', λ')-model whenever λ' ≤ κ'.


Assuming the axiom of constructibility, V = L, the above results show 
that any “finite gap” between cardinals can be translated, any infinite gap
can be replaced by an arbitrary gap, and any gap can be narrowed. 
Counterexamples due to FUHRKEN [1965] show that these results are best 
possible, that is, a finite gap cannot be made wider. Here is the counterex- 
ample for the simplest case. 


4.23. EXAMPLE. Let 𝔐 be the ordered field of real numbers with an extra
unary relation U for the set of rational numbers. 𝔐 is a (2^ω, ω)-model.
Since U is dense in 𝔐, the theory T of 𝔐 does not have a (κ, ω)-model for
κ > 2^ω. More generally T has no (κ, λ)-model for κ > 2^λ.


The situation for two-cardinal analogues of the Upward and Downward 
Löwenheim-Skolem Theorems is complicated and leads to independence
results in set theory (see CHANG and KEISLER [1973]). 
5. Recursively saturated models


The notion of a saturated model and the more recent notion of a 
recursively saturated model have greatly simplified and unified large parts 
of model theory. Intuitively, a saturated model is a model which has all 
possible types of elements. Saturated models were first introduced in an 
eastern version, which we shall call basically saturated models, by JÓNSSON
[1960]. The western version of the notion was introduced by MORLEY and
VAUGHT [1962]. The definition and main results on recursively saturated
models are due to BARWISE and SCHLIPF [1976]. We shall reverse the
historical order and discuss recursively saturated models in this section and 
saturated models in the next section. 

For simplicity we assume in this section that L has only finitely many 
relation, function, and constant symbols. We shall require only an intuitive 
knowledge of recursive sets. The main facts we need are that there are only 
countably many recursive sets, and that any set of formulas of L which can 
be described by a “finite scheme” is recursive. In particular, since L has
only finitely many symbols, the set of all formulas of L is recursive. 

We use the notation Φ(x) to denote a set of formulas φ(x) each with at
most the single free variable x. Φ(x) is said to be satisfiable in 𝔐 if there is
an element m ∈ M which simultaneously satisfies each φ(x) ∈ Φ(x).


5.1. EXAMPLE. In ordered field theory the countable set of formulas

{1 < x, 1+1 < x, 1+1+1 < x, ...}

is satisfiable in 𝔐 if and only if 𝔐 is nonarchimedean.


5.2. EXAMPLE. Consider the set of formulas

{p(x) ≠ 0 : p(x) a polynomial over Q}

in the theory of fields of characteristic zero. This set of formulas is
satisfiable in 𝔐 if and only if 𝔐 is not an algebraic extension of the field Q
of rational numbers.


5.3. DEFINITION. A structure 𝔐 is recursively saturated if and only if for
every finite set Y ⊆ M, every recursive set of formulas Φ(x) in L_Y which is
finitely satisfiable in 𝔐_Y is satisfiable in 𝔐_Y.
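
The gap between "finitely satisfiable" and "satisfiable" can be seen concretely for the set of formulas of Example 5.1 in the ordered field Q. The Python sketch below is only an illustration under stated assumptions: the helper names phi, witness and first_failure are hypothetical, and the k-th formula 1 + ... + 1 < x is modelled by the comparison k < x on exact rationals.

```python
from fractions import Fraction
from math import ceil

def phi(k, x):
    """The k-th formula of Example 5.1, 1 + ... + 1 (k ones) < x, evaluated in Q."""
    return k < x

def witness(indices):
    """A rational satisfying phi(k, x) for every k in a finite set of indices."""
    return Fraction(max(indices, default=0) + 1)

def first_failure(x):
    """Least k >= 1 with phi(k, x) false; it exists because Q is archimedean."""
    return max(1, ceil(x))

finite_part = {1, 5, 40}
w = witness(finite_part)
# w satisfies the chosen finite part, yet some later formula fails at w.
print(w, all(phi(k, w) for k in finite_part), first_failure(w))
```

Every finite subset has a witness, but no single rational satisfies all the formulas, so the ordered field Q is not recursively saturated.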


From Example 5.1 we see that every recursively saturated ordered field 
is nonarchimedean. From Example 5.2, a recursively saturated field of 
characteristic zero cannot be an algebraic extension of Q. The advantage of
using recursive rather than arbitrary sets of formulas is that it allows us to 
prove the following existence theorem. 


5.4. EXISTENCE THEOREM. Every consistent theory T in L has a finite or
countable recursively saturated model. 


PROOF. Let 𝔐_0 be a finite or countable model of T. There are only
countably many finite subsets Y ⊆ M_0, and for each Y there are only
countably many recursive sets of formulas Φ(x) of L_Y. Thus by compactness,
there is a countable elementary extension 𝔐_1 ≻ 𝔐_0 such that each
recursive Φ(x) finitely satisfiable in an 𝔐_{0Y} is satisfiable in 𝔐_{1Y}. Repeat the
construction countably many times. The union 𝔐 = ⋃_{n<ω} 𝔐_n is then
recursively saturated, countable, and a model of T. □


Recursively saturated models have several pleasant properties. The next 
theorem is typical. 


5.5. THEOREM. Every recursively saturated model 𝔐 is ω-homogeneous.
That is, for any two finite tuples a_1, ..., a_n, b_1, ..., b_n in M such that

(𝔐, a_1, ..., a_n) ≡ (𝔐, b_1, ..., b_n),

for every a_{n+1} ∈ M there exists b_{n+1} ∈ M with

(𝔐, a_1, ..., a_{n+1}) ≡ (𝔐, b_1, ..., b_{n+1}).


PROOF. Let

Y = {a_1, ..., a_{n+1}, b_1, ..., b_n}.

The set of all formulas of the form

φ(a_1, ..., a_n, a_{n+1}) → φ(b_1, ..., b_n, x)

is recursive and finitely satisfiable in 𝔐_Y, hence is satisfied by some
b_{n+1}. □


Countable ω-homogeneous models 𝔐 have the interesting property that
whenever

(𝔐, a_1, ..., a_n) ≡ (𝔐, b_1, ..., b_n),

the finite mapping a_i ↦ b_i can be extended to an automorphism of 𝔐.
Many of the applications of recursively saturated models use model pairs
rather than single models. Given a finite language L = {S_1, ..., S_n}, let L'
and L'' be two disjoint languages with symbols of the same type as L,

L' = {S'_1, ..., S'_n},   L'' = {S''_1, ..., S''_n}.

Consider two structures 𝔐, 𝔑 for L with the same universe M = N. By the
model pair (𝔐, 𝔑) we mean the structure for L' ∪ L'' whose L'-reduct is 𝔐
and whose L''-reduct is 𝔑.


5.6. ISOMORPHISM THEOREM. If (𝔐, 𝔑) is a countable recursively saturated
model pair and 𝔐 ≡ 𝔑, then 𝔐 ≅ 𝔑.


PROOF. Going back and forth we construct enumerations

M = {a_0, a_1, ...},   N = {b_0, b_1, ...}

such that for each k,

(𝔐, a_0, ..., a_k) ≡ (𝔑, b_0, ..., b_k).

When k is even we first choose a_k and then choose b_k to satisfy in (𝔐, 𝔑)
the recursive set of formulas

φ'(a_0, ..., a_{k-1}, a_k) → φ''(b_0, ..., b_{k-1}, x),

where φ(x_0, ..., x_k) runs over formulas of L. When k is odd we choose b_k
first and then a_k. □


The above proof is an example of an argument which occurs frequently
in model theory, called the back and forth construction. 
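
The alternation in a back and forth construction can be sketched in the simplest classical setting, two copies of the dense linear order of the rationals, where the required partner element exists by density rather than by recursive saturation. This Python sketch is an illustration only; the function names match and back_and_forth and the two sample enumerations are assumptions made here.

```python
from fractions import Fraction

def match(pairs, a):
    """Given order-preserving pairs (a_i, b_i) and a new a, return a rational b
    that sits among the b_i exactly as a sits among the a_i."""
    for ai, bi in pairs:
        if ai == a:
            return bi
    below = [bi for ai, bi in pairs if ai < a]
    above = [bi for ai, bi in pairs if ai > a]
    if below and above:
        return (max(below) + min(above)) / 2   # density
    if below:
        return max(below) + 1                  # no greatest element
    if above:
        return min(above) - 1                  # no least element
    return Fraction(0)

def back_and_forth(enum_m, enum_n, steps):
    """Alternate sides: even steps pick the next unused a, odd steps the next unused b."""
    pairs = []
    for k in range(steps):
        if k % 2 == 0:
            a = next(x for x in enum_m if all(x != ai for ai, _ in pairs))
            b = match(pairs, a)
        else:
            b = next(y for y in enum_n if all(y != bi for _, bi in pairs))
            a = match([(bi, ai) for ai, bi in pairs], b)
        pairs.append((a, b))
    return pairs

# Finite prefixes of two (hypothetical) enumerations of rationals.
enum_m = [Fraction(n, 3) for n in range(-5, 5)]
enum_n = [Fraction(n * n, 7) - 2 for n in range(10)]
pairs = back_and_forth(enum_m, enum_n, 8)
print(all((a1 < a2) == (b1 < b2) for a1, b1 in pairs for a2, b2 in pairs))
```

Because each side is served in turn, continuing through full enumerations would exhaust both universes, which is what makes the limit map a bijection and not merely an embedding.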

One of the fundamental results in logic is the Robinson Consistency 
Theorem (ROBINSON [1956b]). Many proofs of this theorem have been
found. Here is an especially simple proof which uses recursively saturated 
models. 


5.7. ROBINSON CONSISTENCY THEOREM. Let L_1 and L_2 be two expansions of
the language L_0, with L_0 = L_1 ∩ L_2. Let T_0 be a complete theory in L_0 and let
T_1, T_2 be consistent extensions of T_0 in the languages L_1 and L_2. Then the
union T_1 ∪ T_2 is consistent.


PROOF. Using the Löwenheim-Skolem Theorem and the existence
theorem 5.4, there is a finite or countable recursively saturated model pair
(𝔐, 𝔑) where 𝔐 is a model of T_1 and 𝔑 a model of T_2. 𝔐 and 𝔑 are
structures for L_1 ∪ L_2. The reducts 𝔐_0 and 𝔑_0 of 𝔐 and 𝔑 to L_0 are models
of T_0, so 𝔐_0 ≡ 𝔑_0. Moreover, (𝔐_0, 𝔑_0) is still recursively saturated, so by
the isomorphism theorem 5.6, 𝔐_0 ≅ 𝔑_0. This isomorphism gives us an
expansion 𝔑_1 of 𝔑_0 to L_1 which is a model of T_1. By changing the
interpretation of L_1 − L_0 on 𝔑 to fit 𝔑_1, we obtain a model of T_1 ∪ T_2. □


5.8. EXAMPLE. If G is a finite group, let F(G) be the class of all fields 𝔐
such that the Galois group of 𝔐 over some subfield N ⊆ M is G. Let T be a
complete extension of field theory and let G, H be finite groups. If T has
models 𝔐_1 ∈ F(G) and 𝔐_2 ∈ F(H), then T has a model 𝔐 ∈ F(G) ∩
F(H). This is because F(G) and F(H) can be described by theories with
extra symbols.


Many similar examples can be easily constructed. 
The Robinson Consistency Theorem was also proved independently in 
the following equivalent form by CRAIG [1957].


5.9. CRAIG INTERPOLATION THEOREM. Let L = L' ∩ L'' and let φ, ψ be
sentences of L' and L''. If φ → ψ is valid, there is a sentence θ of L such that
φ → θ and θ → ψ are valid.


PROOF. Suppose there is no such θ. Let T_0 be the set of all consequences of
φ in L. Then T_0 ∪ {¬ψ} is finitely satisfiable, and by compactness has a
model 𝔐. Let T be the complete theory of the L-reduct of 𝔐. T ∪ {φ} and
T ∪ {¬ψ} are both consistent, so by the Robinson Consistency Theorem 5.7,
T ∪ {φ, ¬ψ} is consistent. But this means that φ → ψ is not valid. □


We conclude this section with one more preservation theorem. A
formula φ is positive if φ is built up from atomic formulas using ∧, ∨, ∀, ∃.
A homomorphism of 𝔐 into 𝔑 is a mapping f of M into N such that for
each atomic formula φ and assignment s in 𝔐,

𝔐 ⊨ φ[s] implies 𝔑 ⊨ φ[s∘f].

The substructure of 𝔑 whose universe is the range of f is called the image
of the homomorphism f. We say that T is preserved under homomorphic
images if whenever 𝔐 is a model of T and f is a homomorphism of 𝔐 into
an arbitrary 𝔑, the image of f is a model of T.


5.10. LYNDON HOMOMORPHISM THEOREM (LYNDON [1959]). A theory T is 
preserved under homomorphic images if and only if T is equivalent to a
positive set of sentences. 


PROOF. The proof that positive formulas are preserved under homomorphic
images is by induction on the length of positive formulas. Assume T is
preserved under homomorphic images and let T_P be the set of all positive
consequences of T. Let 𝔑 be a model of T_P. By compactness there is a
model 𝔐 of T such that every positive sentence true in 𝔐 is true in 𝔑. To
avoid complications assume 𝔐 and 𝔑 are infinite. Then there is a countable
recursively saturated model pair (𝔐', 𝔑') with 𝔐 ≡ 𝔐', 𝔑 ≡ 𝔑'. Using a
back and forth construction we see that 𝔑' is a homomorphic image of 𝔐',
whence 𝔑' ⊨ T. Hence T_P is equivalent to T. □


5.11. EXAMPLE. The theories of groups and rings are positive and are well
known to be preserved under homomorphic images. The theory of
integral domains is not preserved under homomorphic images (consider
the integers and the integers mod n), hence its axioms cannot all be
positive. The culprit is the axiom which states that there are no zero
divisors,

∀x ∀y (x·y = 0 → x = 0 ∨ y = 0).
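
The failure of this axiom in a homomorphic image can be checked by direct computation. The Python sketch below is an illustration only (the function names are hypothetical): it verifies on a sample that reduction mod n respects + and ·, and lists the zero divisors of the image Z/n.

```python
def zero_divisors(n):
    """Pairs 0 < x <= y < n with x*y = 0 in the ring Z/n."""
    return [(x, y) for x in range(1, n) for y in range(x, n) if (x * y) % n == 0]

def is_ring_hom_on_sample(n, sample):
    """Check that k -> k mod n respects + and * on all pairs from the sample."""
    h = lambda k: k % n
    return all(
        h(a + b) == (h(a) + h(b)) % n and h(a * b) == (h(a) * h(b)) % n
        for a in sample for b in sample
    )

print(is_ring_hom_on_sample(6, range(-10, 11)))
print(zero_divisors(6))   # nonempty: Z/6 is not an integral domain
print(zero_divisors(7))   # empty: Z/7 is a field
```

The integers satisfy the axiom, the image Z/6 does not, and the axiom is the only non-positive sentence among the usual axioms for integral domains.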


6. Large and small models 


In this section we shall study two kinds of models, saturated models 
(which are large), and prime models (which are small). We shall first 
present saturated models in the original eastern version due to JÓNSSON
[1960], and then a parallel western version due to MORLEY and VAUGHT
[1962]. The theory of prime models is more successful in the western
version, and in its final form is due to VAUGHT [1961]. The compactness
theorem is the main tool in constructing large models. To construct small 
models we need a new tool, the omitting types theorem. 

Jónsson's treatment applies to theories of the following kind.


6.1. DEFINITION. T is a Jónsson theory if:
(i) T has an infinite model.
(ii) T is inductive.
(iii) T has the joint embedding property, that is, any two models 𝔐, 𝔑 of
T are isomorphically embeddable in a model 𝔓 of T.
(iv) T_∀ has the amalgamation property.
6.2. EXAMPLE. Examples of Jónsson theories are:
groups,
abelian groups, 
Boolean algebras, 
linear order, 
fields of characteristic p (prime or zero), 
ordered fields. 


The notation T_∀ denotes the set of all universal consequences of T. By a
basic formula we shall mean a quantifier-free formula. Thus the diagram
D(𝔐) is equivalent to the set of all basic sentences of L_M true in 𝔐_M. More
generally given X ⊆ M we let D(𝔐, X) be the set of all basic sentences of
L_X true in 𝔐_X. It is useful to classify elements of 𝔐 using the notion of a
basic type.


6.3. DEFINITION. A basic type over T is a set of basic formulas Φ(x) with
one variable which is maximal consistent with T. Given an element m of a
model 𝔐 of T, the set of all basic formulas satisfied by m is a basic type,
called the basic type of m.


Notice that every set of basic formulas consistent with T can be extended 
to a basic type over T. 


6.4. EXAMPLE. Let T be the theory of fields of characteristic p (prime or
zero). For each irreducible polynomial P(x) over the prime field there is a 
basic type consisting of all consequences of T U{P(x) = 0}, called the type 
of roots of P(x). T has just one more basic type, the type of the 
transcendental x. 


The theory of ordered fields has 2^ω basic types because there are 2^ω cuts
in the rationals.


6.5. DEFINITION. Let T be a Jónsson theory and κ an infinite cardinal. A
model 𝔐 of T is basically κ-saturated if for each set X ⊆ M of cardinal less
than κ, every basic type over T ∪ D(𝔐, X) is satisfied in 𝔐_X. (Equivalently,
every set of basic formulas Φ(x) consistent with T ∪ D(𝔐, X) is
satisfied in 𝔐_X.)


𝔐 is basically saturated if 𝔐 is basically κ-saturated where κ is the
cardinal of 𝔐 (and κ is infinite). The notion of basically κ-saturated
depends on the theory T, and gets stronger as κ increases.

We state three theorems of JÓNSSON [1960]. In each case we assume T is
a Jónsson theory with at most κ symbols.


6.6. EXISTENCE THEOREM. T has a basically (κ^+)-saturated model of
cardinal 2^κ.


PROOF. The proof is like the existence theorem for recursively saturated
models. We form a chain of models of length κ^+, with models of cardinal
2^κ. At successor stages we use the amalgamation and joint embedding
properties and the compactness theorem to satisfy 2^κ basic types over the
previous model. □


6.7. COROLLARY (GCH). T has a basically saturated model of cardinal κ^+.


6.8. UNIQUENESS THEOREM. Any two basically saturated models of T of the 
same cardinal are isomorphic. 


PROOF. By a back and forth construction. □


6.9. THEOREM. A model 𝔐 of T is basically κ-saturated if and only if:
(i) 𝔐 is basically κ^+-universal. That is, any model of T of cardinal less
than κ^+ is isomorphically embeddable in 𝔐.
(ii) 𝔐 is basically κ-homogeneous. That is, if 𝔑 ⊆ 𝔐, 𝔑 is generated by a
set of cardinal less than κ, and f : 𝔑 → 𝔐, then 𝔐_N and 𝔐_{fN} satisfy the same
basic types.


The above theorem shows the sense in which basically saturated models 
are large. 


6.10. EXAMPLE. The basically ω_1-saturated linear orderings are the η_1-sets
of HAUSDORFF [1914]. η_1-sets (M, <) are dense in the strong sense that if
X, Y are two subsets of cardinal less than ω_1 and X < Y, there exists z ∈ M
with X < z < Y.


It follows from ERDŐS, GILLMAN and HENRIKSEN [1955] that the
basically ω_1-saturated ordered fields are exactly the real closed ordered
fields whose orderings are η_1-sets. These fields are nonarchimedean.
The basically ω_1-saturated fields are exactly the uncountable algebraically
closed fields.

If 2^ω = ω_1, then there is a unique basically saturated group of cardinal ω_1.
This group turns out to be divisible and simple.

The western version of the preceding discussion is obtained by the 
substitutions: 


Jónsson theory → complete theory,
diagram → elementary diagram,
isomorphic embedding → elementary embedding,
basic formula → formula.


In all definitions the adjective “basic” is left off. Here are the high points.


6.3’. DEFINITION. A type over a complete theory T is a maximal consistent 
set of formulas Φ(x).


6.5'. DEFINITION. A structure 𝔐 is κ-saturated if for each X ⊆ M of
cardinal less than κ, every type over Th(𝔐_X) is satisfied in 𝔐_X. 𝔐 is
saturated if it is κ-saturated where κ is the cardinal of 𝔐.


6.6’. EXISTENCE THEOREM. Every complete theory T with an infinite model 
has a (κ^+)-saturated model of cardinal 2^κ.


6.8’. UNIQUENESS THEOREM. Any two elementarily equivalent saturated 
structures of the same cardinal are isomorphic. 


The proofs can either be carried out directly or obtained from the 
previous results using the conservative expansion T’ of Theorem 2.18. 
Recall that T’ is obtained by adding a new relation symbol for each 
formula of L. When T is a complete theory with infinite models, T' is a
Jónsson theory. Moreover, each 𝔐 ⊨ T has a unique expansion 𝔐' ⊨ T',
and types in 𝔐_X correspond to basic types in 𝔐'_X.

Every ω-saturated structure for a finite language is obviously recursively
saturated. Most applications of recursively saturated models are parallel to 
earlier applications of saturated models. Saturated models have many 
other applications. For example, see Chapter A.3. They are also used in the 
two-cardinal theorems of Chang and Jensen in Section 4, and in the Morley 
categoricity theorem in Section 7. One difficulty is that a countable 
saturated model of a theory need not exist, while a countable recursively
saturated model always exists. 
6.10'. EXAMPLE. The η_1-sets and η_1 real closed ordered fields are
ω_1-saturated. Every uncountable algebraically closed field is ω_1-saturated.
However, the basically saturated group of cardinal ω_1 is not saturated.


For the rest of this section we shall assume that L is a countable 
language, and study the countable models of complete theories in L. 

We denote an n-tuple x_1, ..., x_n by x. We shall need to consider types in
n variables. By a type in x over T we mean a set of formulas Φ(x) maximal
consistent with T. A type Φ(x) is realized in a model 𝔐 if it is satisfied by
some m in 𝔐, and omitted in 𝔐 if it is not realized in 𝔐. For convenience
we shall let countable mean finite or of cardinal ω, and we consider finite
structures to be saturated.

We begin with a characterization of theories with countable saturated 
models. 


6.11. THEOREM (VAUGHT [1961]). Let T be a complete theory. Then T has a 
countable saturated model if and only if for each n, T has countably many 
types in n variables. 


PROOF. Let T have a countable saturated model 𝔐. Since 𝔐 is
ω_1-universal, every type in n variables over T is realized in 𝔐. Hence T has at
most countably many types. The proof of the converse is like the existence
proof for recursively saturated models. □


6.12. COROLLARY. If a complete theory T has only countably many
nonisomorphic countable models, then T has a countable saturated model. 


6.13. EXAMPLE. The theory T of algebraically closed fields of characteristic
p has countably many countable models, one of each transcendence
degree n or ω. Therefore T has a countable saturated model. A model of
finite transcendence degree n omits any type of n + 1 algebraically independent
elements. Thus the field of transcendence degree ω must be the
saturated model.


6.14. EXAMPLE. The theory of real closed ordered fields has no countable
saturated model because there are 2^ω types in one variable. Similarly, the
complete theory of the standard model of arithmetic has no countable
saturated model, because we get 2^ω types by considering the standard
primes which divide x.
We next prove the fundamental result which allows us to construct small
countable models. The most natural statement uses infinite disjunctions 
and conjunctions. 


6.15. OMITTING TYPES THEOREM (HENKIN [1954], OREY [1956]). Let T be a
consistent theory and let φ_{mn}(x), m, n < ω, be formulas in x. Assume that for
each formula ψ(x) consistent with T and each m, there exists n such that
ψ(x) ∧ φ_{mn}(x) is consistent with T. Then T has a countable model 𝔐 such
that

𝔐 ⊨ ⋀_{m<ω} ∀x ⋁_{n<ω} φ_{mn}(x).

(The length of the sequence x = x_1 ··· x_{k(m)} is independent of n.)


PROOF. We argue as in the compactness proof. Add a countable new set of
constant symbols C. By listing all sentences ψ of L_C and all pairs (m, c), we
can construct a theory T' in L_C such that:

(1) T ⊆ T'.

(2) T' is complete in L_C.

(3) If ∃x ψ(x) ∈ T', then ψ(c) ∈ T' for some c ∈ C.

(4) For each c in C and m < ω, there is an n < ω such that φ_{mn}(c) ∈ T'.

T' has a unique model 𝔐' such that every m ∈ M' is an interpretation of
some c ∈ C. The reduct 𝔐 of 𝔐' to L has the required property. □


The original papers on the Omitting Types Theorem dealt with the
special case of ω-models of arithmetic. Let L contain among its symbols
the constants 0, 1, 2, .... T is said to be ω-complete if for every formula
φ(x) of L,

T ⊢ φ(0), T ⊢ φ(1), ... implies T ⊢ ∀x φ(x).

The infinite rule of proof which states that ∀x φ(x) can be inferred from
φ(0), φ(1), ... is called the ω-rule. A model 𝔐 of T is said to be an ω-model
if every element of M is equal to some constant n, that is,

𝔐 ⊨ ∀x ⋁_{n<ω} x = n.


6.16. ω-COMPLETENESS THEOREM. If T is consistent and ω-complete, then T
has an ω-model.


PROOF. If φ(x) is consistent with T, then for some n < ω, T ⊬ ¬φ(n).
Therefore φ(x) ∧ x = n is consistent with T, and the omitting types
theorem applies. □
The ω-completeness theorem also holds in the more general case of
ω-logic, which has one sort of variables for natural numbers and another
sort for elements (see 5.2 in Chapter A.1).

The following result was proved for models of arbitrary cardinality by
MACDOWELL and SPECKER [1961]. For countable models we give a simple
proof using the omitting types theorem. 


6.17. THEOREM. Let 𝔐 be a countable model of Peano arithmetic. Then 𝔐
has a proper elementary extension 𝔑 such that whenever m ∈ M and
n ∈ N − M we have m < n.


PROOF. Add a new constant c to L_M and let T be the theory

Th(𝔐_M) ∪ {m < c : m ∈ M}.

Using the omitting types theorem and the pigeon hole principle in Peano
arithmetic, one can show that T has a model 𝔑' such that

𝔑' ⊨ ⋀_{a∈M} ∀x (a ≤ x ∨ ⋁_{b<a} x = b).

The reduct 𝔑 of 𝔑' to L has the required property. □


We have given two applications of the omitting types theorem to 
incomplete theories. We now return to our study of countable models of 
complete theories. 

6.18. DEFINITION. A formula φ(x) is said to be complete in T if the set

{ψ(x) : T ⊢ φ(x) → ψ(x)}

is a type over T. A type which is generated by a complete formula is called
an isolated type.


6.19. THEOREM (VAUGHT [1961]). Let T be a complete theory. T has a model 
which omits a type p if and only if p is not isolated. 


PROOF. If p is the isolated type determined by a complete formula φ(x),
then T ⊢ ∃x φ(x) and hence every model of T realizes p. Suppose p is not
isolated. Then by the omitting types theorem, T has a model 𝔐 such that

𝔐 ⊨ ∀x ⋁_{ψ∈p} ¬ψ(x),

hence 𝔐 omits p. □
We are ready to study small models.


6.20. DEFINITION. A prime model of T is a model 𝔐 of T which is
elementarily embeddable in every model of T. 


Prime models are necessarily countable, and are the opposite of
ω_1-universal models. The following results are due to VAUGHT [1961].


6.21. THEOREM. Let T be a complete theory. 𝔐 is a prime model of T if and
only if 𝔐 is countable and every type realized in 𝔐 is isolated.


PROOF. Use Theorem 6.19. □


6.22. UNIQUENESS THEOREM FOR PRIME MODELS. Any two prime models of a 
complete theory T are isomorphic. 


The proof is again by a back and forth construction using 6.21. 


6.23. EXISTENCE THEOREM FOR PRIME MODELS. A complete theory T has a 
prime model if and only if every formula consistent with T belongs to an 
isolated type over T. 


PROOF. If 𝔐 is a prime model of T, then any φ(x) consistent with T is
satisfied in 𝔐 and hence belongs to an isolated type. Assume every formula
consistent with T belongs to an isolated type. Let Φ_k be the set of all
complete formulas in k variables over T. By the omitting types theorem, T
has a countable model 𝔐 such that

𝔐 ⊨ ⋀_k ∀x ⋁_{φ∈Φ_k} φ(x).

Then by 6.21, 𝔐 is a prime model. □


6.24. COROLLARY. If a complete theory T has a countable saturated model,
then it has a prime model. 


PROOF. Suppose T has no prime model. Let φ(x) be consistent but not
contained in an isolated type over T. Then there is a binary tree of formulas
[Diagram: binary tree of formulas with root φ(x), branching into φ_0(x), φ_1(x), then φ_00(x), ..., φ_11(x), ···]


consistent with T such that each branch of the tree can be extended to a 
different type. Hence T has 2^ω types in x and T has no countable saturated
model. □


6.25. EXAMPLE. By a process of elimination, the prime model of the theory 
of algebraically closed fields of characteristic p is the field of algebraic 
numbers. The field of real algebraic numbers is the prime model of the 
theory of real closed ordered fields. The standard model 𝔑 of arithmetic is
the prime model of Th(𝔑).


As a culmination of our study of countable saturated models we obtain a 
characterization of ω-categorical theories.


6.26. THEOREM (Engeler, Ryll-Nardzewski, Svenonius; VAUGHT [1961]). 
Let T be a complete theory with no finite models. T is ω-categorical if and
only if for each n, T has only finitely many types in n variables. 


PROOF. Assume T is ω-categorical. Let 𝔐 be the countable model of T. By
the Downward Löwenheim-Skolem Theorem, 𝔐 is a prime model. Since
every type over T is realized in 𝔐, 6.21 shows that every type over T is
isolated. If T had infinitely many types in x the set of negations of 
complete formulas in x would be consistent and hence contained in a type 
which is not isolated. Therefore T has only finitely many types in x. 


Now assume that for each n, T has only finitely many types in n
variables, say p_1, ..., p_m. The type p_i contains the complete formula

φ = φ_1 ∧ ··· ∧ φ_m

where φ_j ∈ p_i − p_j for i ≠ j. Therefore each type in n variables over T is
isolated. It follows from 6.21 that every countable model of T is prime, so
by the uniqueness theorem for prime models, T is ω-categorical. □


6.27. EXAMPLE. The theory of dense linear order is ω-categorical and
admits elimination of quantifiers. Thus for each n, the finitely many types
in x_1, ..., x_n are determined by the relative order of x_1, ..., x_n.
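
The finitely many types of this example can be enumerated by brute force. The Python sketch below is an illustration only: it assumes the theory meant is dense linear order without endpoints, encodes a "relative order" of x_1, ..., x_n as a tuple of ranks with ties allowed, and uses the hypothetical name order_types.

```python
from itertools import product

def order_types(n):
    """All relative orders of n variables, ties allowed.

    Each pattern is a tuple of ranks: (0, 1, 0) encodes x_1 = x_3 < x_2.
    """
    patterns = set()
    for values in product(range(n), repeat=n):
        ranks = sorted(set(values))
        # Replace each value by its rank among the distinct values used.
        patterns.add(tuple(ranks.index(v) for v in values))
    return patterns

for n in range(1, 5):
    print(n, len(order_types(n)))
```

For n = 2 the three patterns correspond to x_1 < x_2, x_1 = x_2 and x_2 < x_1; by elimination of quantifiers each pattern determines a complete type, so the count is finite for every n, as Theorem 6.26 requires.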


Our treatment of prime models made strong use of the fact that if a 
formula ¢ (x) is consistent with a complete theory T, then it is satisfied in 
every model of T. Since Jónsson theories and basic formulas do not have
this property, we do not get an analogous eastern theory for basically prime 
models. However, in Section 8 we shall obtain an eastern version of the 
omitting types theorem. 


7. Stable theories 


MORLEY [1965] proved the following famous theorem, answering a
question of Łoś.


7.1. MORLEY CATEGORICITY THEOREM. Let T be a theory in a countable
language L. If T is categorical in some uncountable cardinal κ, then T is
categorical in every uncountable cardinal λ.


Thus for countable L there are only two kinds of κ-categoricity,
ω-categoricity and ω_1-categoricity. (The result was later extended to
uncountable languages by Rowbottom, Ressayre, and ultimately SHELAH 
[1974].) The proof of 7.1 has been considerably simplified but is still too 
technical to give here. However the methods have become more important 
than the theorem itself. Morley gave a beautiful and fruitful classification 
of types over a theory. We begin with an eastern version because it has 
natural examples from algebra. To motivate the theory we look at
extensions of fields and abelian groups. 

Consider a theory T and a model 𝔐 of T_∀. By a basic type over 𝔐 we
mean a basic type in one variable over the theory T ∪ D(𝔐). A basic type
is isolated if it is generated by a single basic formula.

First let T be the theory of fields of given characteristic and let 𝔐 be a
model of T. The basic types over 𝔐 are:

(0) For each irreducible polynomial P(x) over 𝔐, an isolated type
generated by the formula P(x) = 0. In an extension 𝔑 ⊇ 𝔐, an element has
one of these types if and only if it is algebraic over 𝔐.

(1) A single nonisolated type, realized in an extension 𝔑 ⊇ 𝔐 by the
transcendental elements over 𝔐.

The types (0) are said to be algebraic, or of Morley rank zero, over 𝔐.
The type (1) is said to be of Morley rank one over 𝔐. If 𝔐 ⊆ 𝔑, each basic
type over 𝔐 can be extended to several different basic types over 𝔑.
However, for field theory it happens that if a basic type p over 𝔐 is
isolated, then every extension q ⊇ p over 𝔑 is isolated.

Now let T be the theory of abelian groups and let 𝔐 ⊨ T. Obviously, for
each m ∈ M the formula x = m generates an isolated type, which has
Morley rank zero. Another type p is generated by the set of formulas

{x ≠ m : m ∈ M} ∪ {2x = 0}.

p is the type of x ∉ M of order 2, and is isolated if and only if 𝔐 has only
finitely many elements of order 2. However, over an extension 𝔑 ⊇ 𝔐, p
will always split into a set of isolated types over 𝔑 and at most one
nonisolated type. p is an example of a type of Morley rank 1. The type of x
of order 4 such that 2x ∉ M has Morley rank 2, and so on. The type of x
such that nx ∉ M for all n has Morley rank ω.

We are ready for the general definition. It is easier to define the Morley 
rank of a basic formula. 

Until further notice we assume that T is a Jónsson theory in a countable
language. Recall from Section 3 that 𝔐 ⊨ T_∀ iff 𝔐 is a substructure of some
𝔑 ⊨ T.


7.2. DEFINITION. Let 𝔐 be a model of T_∀. The (basic) Morley rank m(φ) of
a basic formula φ(x) of L_M is defined inductively by:
(i) m(φ) = −1 if φ(x) is inconsistent with T ∪ D(𝔐).

(ii) m(φ) = 0 if for each 𝔐 ⊆ 𝔑 ⊨ T_∀, φ(x) belongs to finitely many
basic types over 𝔑.

(iii) Let α be an ordinal. m(φ) = α if for each 𝔐 ⊆ 𝔑 ⊨ T_∀, the set
{φ(x)} ∪ {¬ψ(x) : ψ has rank < α over 𝔑} can be extended to finitely
many basic types over 𝔑.

(iv) m(φ) = ∞ if m(φ) ≠ −1 and m(φ) ≠ α for all ordinals α.

Here “finitely many” means at least one but fewer than ω.

Notice that if T ⊢ φ(x) → ψ(x) then m(φ) ≤ m(ψ).


7.3. DEFINITION. The Morley rank of a basic type p over 𝔐 is the least α
such that p contains a formula of Morley rank α.


We begin with a lemma showing that Morley ranks are well behaved. 


7.4. LEMMA. Let 𝔐 ⊆ 𝔑 ⊨ T_∀, and let φ(x) be a basic formula of L_M.
(i) The Morley ranks of φ(x) over 𝔐 and over 𝔑 are equal.
(ii) Let m(φ) = α < ∞. Then φ belongs to finitely many basic types of
rank α and no basic types of rank > α over 𝔑.

(iii) If p has rank α < ∞ over 𝔐, then p can be extended to finitely many
basic types of rank α and no basic types of rank > α over 𝔑.


PROOF. (i) is by induction on rank using the amalgamation property of T_∀.
The implications (i) → (ii) and (ii) → (iii) follow easily from the definitions. □


7.5. DEFINITION. The (basic) Morley rank of T is the Morley rank of the
true formula x = x over any model 𝔐 ⊨ T_∀. Equivalently, the Morley rank
of T is the maximum rank of a basic type over a model 𝔐 ⊨ T_∀.


MORLEY [1965] calls theories of rank < ∞ totally transcendental.


7.6. EXAMPLE. The theory of fields of given characteristic has Morley rank
one. The theory of equivalence relations (M, E) has Morley rank two. (For
each $m \in M$, the type of $x \notin M$ with $E(x, m)$ has rank one, while the set of
formulas $\{\neg E(x, m) : m \in M\}$ generates a type of rank two.) The theories
of abelian groups and differential fields of characteristic zero have Morley
rank $\omega$. The theory of linear order has Morley rank $\infty$. A linear ordering $\mathfrak{M}$
has the following types: for each $m \in M$, a type of rank zero generated by
x = m; for each Dedekind cut of $\mathfrak{M}$, a type of rank $\infty$.

Perhaps the best way to understand the definition of the Morley rank is 
in topological terms. It is closely related to the Cantor-Bendixson deriva- 
tive. In Section 3 we explained the compactness theorem with a topological 
space whose points are complete theories. Now we form a topological 
space whose points are types. 

Given a Jónsson theory T and a model $\mathfrak{M}$ of $T_\forall$, let $S(\mathfrak{M})$ be the set of
basic types in one variable over $\mathfrak{M}$. For each basic formula $\varphi(x)$ of $L_M$, let


$$[\varphi] = \{p \in S(\mathfrak{M}) : \varphi \in p\}.$$


We make $S(\mathfrak{M})$ into a topological space, the basic Stone space over $\mathfrak{M}$, by
taking as a closed basis the sets of the form $[\varphi]$. It follows from the
compactness theorem that $S(\mathfrak{M})$ is a compact totally disconnected Hausdorff
space. A point $p \in S(\mathfrak{M})$ is isolated in the sense of being generated by
a single basic formula if and only if it is isolated in the topological sense.
Thus if $[\varphi]$ is finite, then every $p \in [\varphi]$ is isolated. The Cantor-Bendixson
derivative of $S(\mathfrak{M})$ is the space


$$S(\mathfrak{M})' = \{p \in S(\mathfrak{M}) : p \text{ is not isolated}\}.$$


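Iterating the process through the ordinals, we define

$$S(\mathfrak{M})^{0} = S(\mathfrak{M}),$$

$$S(\mathfrak{M})^{\alpha+1} = (S(\mathfrak{M})^{\alpha})',$$

$$S(\mathfrak{M})^{\alpha} = \bigcap_{\beta<\alpha} S(\mathfrak{M})^{\beta} \quad \text{for limit } \alpha,$$

$$S(\mathfrak{M})^{\infty} = \bigcap_{\alpha} S(\mathfrak{M})^{\alpha}.$$


For each $\alpha$, $S(\mathfrak{M})^{\alpha}$ is a closed set. The intersection $S(\mathfrak{M})^{\infty}$ is called the
perfect kernel of $S(\mathfrak{M})$. Finally, for each $p \in S(\mathfrak{M})$, the Cantor-Bendixson
rank of p is the greatest $\alpha$ such that $p \in S(\mathfrak{M})^{\alpha}$.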
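
To make the iteration concrete, here is a small worked example. The spaces X and Y below are hypothetical stand-ins chosen for illustration (a convergent sequence and the Cantor space), not Stone spaces computed in the text; the sketch only shows how the derivative and the rank behave.

```latex
% Assumption: X = \{0\} \cup \{1/n : n \geq 1\} as a subspace of the real line.
\begin{align*}
X^{0} &= X, \\
X^{1} &= X' = \{0\}      && \text{(each } 1/n \text{ is isolated, } 0 \text{ is not)},\\
X^{2} &= (X^{1})' = \emptyset && \text{(}0 \text{ is isolated in } \{0\}\text{)},\\
X^{\infty} &= \emptyset  && \text{(the perfect kernel is empty)}.
\end{align*}
% Cantor-Bendixson ranks: each point 1/n has rank 0, and the point 0 has rank 1.

% Assumption: Y = \{0,1\}^{\omega}, the Cantor space. It has no isolated points, so
\begin{align*}
Y' = Y, \qquad Y^{\infty} = Y .
\end{align*}
% Here the whole space is its own perfect kernel.
```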

The Morley rank differs from the Cantor-Bendixson rank because the
Morley rank involves arbitrary extensions $\mathfrak{N} \supseteq \mathfrak{M}$ instead of $\mathfrak{M}$ itself. One
way to define the Morley rank topologically is to use the category of basic
Stone spaces of models (see SACKS [1972]). A simpler way is to take $\mathfrak{M}$ to be
basically $\omega_1$-saturated.


7.7. THEOREM. Let $\mathfrak{M}$ be a basically $\omega_1$-saturated model of T. Then for
every basic type $p \in S(\mathfrak{M})$, the Morley rank of p is equal to the
Cantor-Bendixson rank of p.


PROOF. To get an idea of what is going on we give the proof for rank zero.
If p has Morley rank zero, then it is isolated over every $\mathfrak{N} \supseteq \mathfrak{M}$, hence
isolated over $\mathfrak{M}$ and of Cantor-Bendixson rank zero. Suppose p has
Morley rank greater than zero, and let $p \in [\varphi]$. Then $\varphi(x)$ has Morley rank
greater than zero, so there is an $\mathfrak{N}$ such that $\varphi(x)$ belongs to infinitely many
basic types over $\mathfrak{N}$, say $q_0, q_1, q_2, \ldots$. These types can be distinguished by
countably many formulas, so we may take $\mathfrak{N}$ to be countable. Since $\mathfrak{M}$ is
basically $\omega_1$-saturated we may take $\mathfrak{N} \subseteq \mathfrak{M}$. Therefore $[\varphi]$ is infinite, p is
not isolated, and p has Cantor-Bendixson rank $> 0$. □


7.8. LEMMA. Let $\mathfrak{M}$ be a basically $\omega_1$-saturated model of T. A basic formula
$\varphi(x)$ over $\mathfrak{M}$ has Morley rank $\infty$ if and only if there is a binary tree of basic
formulas

$$
\begin{array}{ccccccc}
 & & & \varphi(x) & & & \\
 & \varphi_0(x) & & & & \varphi_1(x) & \\
\varphi_{00}(x) & & \varphi_{01}(x) & & \varphi_{10}(x) & & \varphi_{11}(x) \\
\vdots & & \vdots & & \vdots & & \vdots
\end{array}
$$

such that each branch is consistent with $T \cup D(\mathfrak{M})$ but no basic type belongs
to more than one branch.


PROOF. The proof is purely topological, and the lemma is a classical result
on perfect kernels. For some $\alpha$, $S(\mathfrak{M})^{\infty} = S(\mathfrak{M})^{\alpha} = S(\mathfrak{M})^{\alpha+1}$, so $S(\mathfrak{M})^{\infty}$ has
no isolated points. Therefore if $m(\varphi) = \infty$, $[\varphi] \cap S(\mathfrak{M})^{\infty}$ is infinite and
may be split repeatedly to obtain a binary tree with root $\varphi(x)$.

For the converse suppose $m(\varphi) = \alpha < \infty$ and there is a binary tree with
root $\varphi(x)$. We may assume $\alpha$ is the least such ordinal. Since $[\varphi]$ contains
only finitely many points of rank $\alpha$, there must be a formula $\varphi_s(x)$ in the
tree of rank less than $\alpha$. But $\varphi_s(x)$ is also the root of a binary tree, so we
have a contradiction. □


The next result of MORLEY [1965] characterizes totally transcendental
theories in terms of the size of the Stone spaces.


7.9. DEFINITION. A theory T is basically $\kappa$-stable if for every model $\mathfrak{M}$ of
$T_\forall$ of cardinal $\kappa$, the basic Stone space $S(\mathfrak{M})$ has cardinal $\kappa$.


7.10. THEOREM. T has Morley rank $< \infty$ if and only if T is basically
$\omega$-stable.


PROOF. If T has Morley rank $\infty$, then by 7.8 there is a binary tree of basic
formulas over a model $\mathfrak{M}$ of T. Take a countable submodel $\mathfrak{N} \subseteq \mathfrak{M}$
containing all the constants which occur in the tree. Since the tree has $2^{\omega}$
branches, there are $2^{\omega}$ basic types over $\mathfrak{N}$, and T is not $\omega$-stable.

Now suppose T is not $\omega$-stable. Then $S(\mathfrak{M})$ is uncountable for some
countable $\mathfrak{M} \models T_\forall$. Each $\varphi(x)$ such that $[\varphi]$ is uncountable can be split into
two disjoint basic sets $[\varphi_0]$ and $[\varphi_1]$ which are still uncountable, because
there are only countably many formulas over $\mathfrak{M}$. Hence there is a binary
tree of basic formulas over $\mathfrak{M}$. □
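
The counting step in the first half of the proof can be spelled out as follows. This is a sketch; the indexing of the tree by finite binary strings s and of its branches by infinite binary sequences b is notation introduced here, not taken from the text.

```latex
% Nodes of the tree: \varphi_s for s \in \{0,1\}^{<\omega} (finite binary strings).
% Branches of the tree: b \in \{0,1\}^{\omega}, with b|n the first n digits of b.
\begin{align*}
|\{0,1\}^{<\omega}| &= \sum_{n<\omega} 2^{n} = \omega
   && \text{(so countably many constants occur, and } \mathfrak{N} \text{ can be countable)},\\
|\{0,1\}^{\omega}|  &= 2^{\omega}
   && \text{(the number of branches)}.
\end{align*}
% Each branch \{\varphi_{b|n} : n < \omega\} is consistent, so it extends to a basic
% type p_b over N; no basic type belongs to two branches, so b \neq b' gives
% p_b \neq p_{b'}. Hence
\begin{align*}
|S(\mathfrak{N})| \;\geq\; 2^{\omega} \;>\; \omega = |N| .
\end{align*}
```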


7.11. COROLLARY. If T is basically $\omega$-stable, then the Morley rank of T is
a countable ordinal.


PROOF. By induction on the number of generators, there are only countably
many nonisomorphic finitely generated models $\mathfrak{M} \models T_\forall$. Therefore
there are only countably many different Morley ranks of basic formulas
over models of $T_\forall$. We can see from the definition that if there are no
formulas of rank $\alpha$, then there are none of rank $> \alpha$. Hence every formula
has countable rank. □


Notice that if T is basically $\kappa$-stable where $\kappa$ is regular (perhaps $\omega$), we
can construct a basically saturated model of T of cardinal $\kappa$, even without
the GCH. The work of Shelah has shown that $\kappa$-stability is of fundamental
importance. He used the following classification of theories.


7.12. DEFINITION. The basic stability spectrum of T is the class of all
infinite cardinals $\kappa$ such that T is basically $\kappa$-stable.


7.13. THEOREM (SHELAH [1971]). Every theory T has one of the following
four basic stability spectra $s_T$.
(i) $s_T = \{\kappa : \omega \le \kappa\}$ (T is basically $\omega$-stable).
(ii) $s_T = \{\kappa : 2^{\omega} \le \kappa\}$ (T is basically superstable).
(iii) $s_T = \{\kappa : \kappa = \kappa^{\omega}\}$ (T is basically stable).
(iv) $s_T$ is empty (T is basically unstable).


The first step of the proof, due to MORLEY [1965], proceeds as follows.
Assume T is not basically $\kappa$-stable. As in the proof of Theorem 7.10, there
is a binary tree of basic formulas over some model of T, hence T is not
basically $\omega$-stable. Thus $\omega$-stability implies $\kappa$-stability for all $\kappa$. The
remaining steps are much harder.


7.14. EXAMPLE. The theory of countably many unary relations is basically
superstable but not $\omega$-stable. The theory of R-modules where $R =
Z \oplus Z \oplus Z \oplus \cdots$ is basically stable but not superstable. So is the theory of
differential fields of prime characteristic p with a symbol for the p-th root
(SHELAH [1971]).


A finer classification of theories is obtained using the stability function of 
T. 


7.15. DEFINITION. The basic stability function of T is the function $f_T$ on
infinite cardinals $\kappa$ defined by


$$f_T(\kappa) = \sup\{|S(\mathfrak{M})| : \mathfrak{M} \text{ is a model of } T \text{ of cardinal } \kappa\}.$$


The basic stability function of linear order is denoted by $\mathrm{ded}(\kappa)$. Thus

$$\mathrm{ded}(\kappa) = \sup\{\lambda : \text{there is a linear order of cardinal } \kappa \text{ with } \lambda \text{ Dedekind cuts}\}.$$


7.16. THEOREM (KEISLER [1976]). Every theory T has one of the following six
basic stability functions $f_T$:
(i) $f_T(\kappa) = \kappa$ (T is basically $\omega$-stable).

(ii) $f_T(\kappa) = \kappa + 2^{\omega}$ (T is basically superstable).

(iii) $f_T(\kappa) = \kappa^{\omega}$ (T is basically stable).

(iv) $f_T(\kappa) = \mathrm{ded}(\kappa)$ (T is basically ordered).

(v) $f_T(\kappa) = (\mathrm{ded}\,\kappa)^{\omega}$ (T is basically multiply ordered).

(vi) $f_T(\kappa) = 2^{\kappa}$ (T is basically independent).


Each kind of theory has a syntactical characterization which suggests the 
name given to it. The theories (iv), (v), and (vi) are basically unstable. 
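
As a hedged sanity check on the six functions of Theorem 7.16, the following sketch evaluates each of them at $\kappa = \omega$ using standard cardinal arithmetic. The evaluation is an illustration added here; it uses the standard fact that the rationals have $2^{\omega}$ Dedekind cuts and that no countable order has more.

```latex
% Values of the six basic stability functions at \kappa = \omega.
\begin{align*}
\text{(i)}   \quad & \omega, \\
\text{(ii)}  \quad & \omega + 2^{\omega} = 2^{\omega}, \\
\text{(iii)} \quad & \omega^{\omega} = 2^{\omega}, \\
\text{(iv)}  \quad & \mathrm{ded}(\omega) = 2^{\omega}, \\
\text{(v)}   \quad & (\mathrm{ded}\,\omega)^{\omega} = (2^{\omega})^{\omega} = 2^{\omega}, \\
\text{(vi)}  \quad & 2^{\omega}.
\end{align*}
% So countable models alone separate case (i) from the other five;
% the finer distinctions only appear at uncountable \kappa.
```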


7.17. EXAMPLE. The theories of linear order and ordered fields are basically
ordered. The theory of countably many linear order relations is
basically multiply ordered. The theories of groups, Boolean algebras, and 
rings are independent. For more examples see KEISLER [1976]. 


The classification theorems 7.13 and 7.16 hold not only for Jónsson
theories but for arbitrary theories T with a slight change in the definition of 
the stability function. 

Everything we have done in this section also holds in a western version. 
We assume that T is a complete theory in a countable language and 
everywhere remove the word “‘basically”’. 

A very rough plan of the proof of the Morley categoricity theorem is as
follows. Assuming T is $\kappa$-categorical for $\kappa > \omega$, first show that T is
$\omega$-stable. Then show that the model of cardinal $\kappa$ is saturated. Next, prove
that if T has an uncountable model which is not saturated, then T has a
model of cardinal $\kappa$ which is not saturated. Hence all uncountable models
of T are saturated.

Here is a characterization of $\omega_1$-categorical theories in terms of stability.
The necessity is due to MORLEY [1965] and the sufficiency to BALDWIN and
LACHLAN [1971].


7.18. THEOREM. For a complete theory T to be $\omega_1$-categorical it is necessary
and sufficient that:

(i) T is $\omega$-stable;

(ii) for every $\mathfrak{M} \models T$ of cardinal $\omega_1$, every definable set in $\mathfrak{M}_M$ is either
finite or uncountable.


The cardinality spectrum of a complete theory T is the function
$g_T(\kappa)$ = number of nonisomorphic models of T of cardinality $\kappa$. A very
difficult problem is to determine which functions are cardinality spectra.
The Morley categoricity theorem shows that for uncountable $\kappa$, $g_T(\kappa)$ is
either always or never equal to one. Much more progress has been made
using stability, particularly by Shelah. We state two typical results.


7.19. THEOREM (LACHLAN [1973]). If T is superstable but not $\omega$-categorical,
then T has at least $\omega$ nonisomorphic countable models.


Here is a result at the opposite extreme from categoricity. 


7.20. THEOREM (SHELAH [1971]). If T is not superstable, then T has $2^{\kappa}$
nonisomorphic models in every uncountable cardinal $\kappa$.


7.21. EXAMPLE. If $\mathfrak{M}$ is an infinite linear order, or an infinite Boolean
algebra, then Th($\mathfrak{M}$) is unstable and therefore has $2^{\kappa}$ nonisomorphic
models of each uncountable cardinal $\kappa$. The result fails for $\kappa = \omega$; for
example, the theories of dense linear order and atomless Boolean algebras
are unstable but $\omega$-categorical.


Some other important applications of stability are discussed in Chapters 
A.4 and A.5. 


8. Model-theoretic forcing 


We shall describe A. Robinson's finite forcing (ROBINSON [1970]). It is
analogous to Cohen forcing but is a simpler construction which gives
models of an arbitrary inductive theory. Forcing is closely related to the
omitting types theorem in Section 6, and in fact yields an eastern version of
the omitting types theorem (Theorem 8.15). For a more general theory of
forcing which includes Cohen and Robinson forcing as well as the omitting
types theorem, see KEISLER [1973].

The material in this section belongs to eastern model theory. 

Assume throughout this section that L is a countable language and C is a
countably infinite set of new constant symbols. T will always denote a 
consistent theory in L. 


8.1. DEFINITION. By a condition for T we mean a finite set of atomic and
negated atomic sentences of $L_C$ which is consistent with T. p, q, ... denote
conditions for T. Notice that the empty set $\emptyset$ is a condition for T.


8.2. DEFINITION. The relation p forces $\varphi$, denoted by $p \Vdash \varphi$ or if necessary
$p \Vdash_T \varphi$, is between conditions p and arbitrary sentences $\varphi$ of $L_C$. It is
defined inductively on the length of $\varphi$ by:


If $\varphi$ is atomic, $p \Vdash \varphi$ iff $\varphi \in p$.
$p \Vdash \neg\varphi$ iff there is no condition $q \supseteq p$ with $q \Vdash \varphi$.
$p \Vdash \exists x\,\varphi(x)$ iff $p \Vdash \varphi(c)$ for some $c \in C$.
$p \Vdash \varphi \vee \psi$ iff $p \Vdash \varphi$ or $p \Vdash \psi$.

In the above definition we take $\neg$, $\vee$, $\exists$ to be the fundamental symbols
and regard $\wedge$, $\to$, $\leftrightarrow$, $\forall$ as abbreviations. The definition of forcing is just
like the definition of satisfaction except for the $\neg$ clause. Intuitively, we
can think of a condition p as a finite amount of information about a
structure $\mathfrak{M}$. We say that p weakly forces $\varphi$, $p \Vdash^{w} \varphi$, if $p \Vdash \neg\neg\varphi$.


8.3. LEMMA. (i) If $p \Vdash \varphi$ and $p \subseteq q$, then $q \Vdash \varphi$.
(ii) If $\varphi \in p$, then $p \Vdash \varphi$.
(iii) $p \Vdash^{w} \varphi$ if and only if for all $q \supseteq p$ there exists $r \supseteq q$ with $r \Vdash \varphi$.
(iv) $p \Vdash \varphi$ implies $p \Vdash^{w} \varphi$.


PROOF. (i) is by induction on the length of $\varphi$, while (ii)-(iv) are
immediate. □
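
The difference between forcing and weak forcing can be seen in a very small case. The sketch below is an illustration added here; it assumes that equality is a logical symbol, so that c = c is an atomic sentence of $L_C$ for each $c \in C$, and it uses only the clauses of Definition 8.2.

```latex
% Assumption: T is any consistent theory and c \in C; c = c is atomic in L_C.
\begin{align*}
&\emptyset \nVdash c = c
   && \text{since } (c = c) \notin \emptyset \text{ (atomic clause)};\\
&\text{for every condition } q,\ r = q \cup \{c = c\} \text{ is a condition}
   && \text{since } c = c \text{ is valid and } q \text{ is consistent with } T;\\
&r \Vdash c = c
   && \text{since } (c = c) \in r;\\
&\text{no condition } q \text{ forces } \neg(c = c)
   && \text{since some } r \supseteq q \text{ forces } c = c;\\
&\emptyset \Vdash \neg\neg(c = c), \text{ that is, } \emptyset \Vdash^{w} c = c
   && \text{by the } \neg \text{ clause}.
\end{align*}
% So the empty condition weakly forces c = c without forcing it.
```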


To construct models by forcing we introduce the notion of a generic set. 


8.4. DEFINITION. A generic set for T is a set G of basic sentences where:

(i) Each finite $p \subseteq G$ is a condition for T.

(ii) For each sentence $\varphi$ of $L_C$ there is a condition $p \subseteq G$ such that
either $p \Vdash \varphi$ or $p \Vdash \neg\varphi$.


$G \Vdash \varphi$ means that $p \Vdash \varphi$ for some $p \subseteq G$. It follows that for each $\varphi$,
exactly one of $G \Vdash \varphi$, $G \Vdash \neg\varphi$ holds.


8.5. LEMMA. For each condition p there is a generic set $G \supseteq p$.


PROOF. Let $\varphi_0, \varphi_1, \ldots$ be a list of all sentences of $L_C$. Using the negation
clause we can choose a chain of conditions $p \subseteq p_0 \subseteq p_1 \subseteq \cdots$ such that for each
n, $p_n \Vdash \varphi_n$ or $p_n \Vdash \neg\varphi_n$. Then $G = \bigcup_n p_n$ is generic. □


The preceding lemma is our reason for assuming that L and C are 
countable. We now come to the main result of the construction. 


8.6. GENERIC MODEL THEOREM. Let G be a generic set for T. There is a
unique (up to isomorphism) structure $\mathfrak{M}(G)$ for $L_C$ such that:

(i) each $m \in M(G)$ is the interpretation of some $c \in C$,

(ii) for each sentence $\varphi$ of $L_C$,


$$\mathfrak{M}(G) \models \varphi \quad \text{iff} \quad G \Vdash \varphi.$$

That is,

$$\mathrm{Th}(\mathfrak{M}(G)) = \{\varphi : G \Vdash \varphi\}.$$


PROOF. The relation

$$c \sim d \quad \text{iff} \quad G \Vdash c = d$$


is an equivalence relation on C. For the universe of $\mathfrak{M}(G)$ we take a set
$M(G) \subseteq C$ containing exactly one element of each equivalence class.
When $c \sim m \in M(G)$ interpret c by m, and let $\mathfrak{M}(G)$ be the unique model of
G with universe M(G). Then (ii) holds for each atomic $\varphi$. It follows by
induction on the length of $\varphi$ that (ii) holds for all $\varphi$. The fact that G is
generic is used in the negation step of the induction,


$$\mathfrak{M}(G) \models \neg\varphi \quad \text{iff} \quad \mathfrak{M}(G) \not\models \varphi \quad \text{iff} \quad G \nVdash \varphi \quad \text{iff} \quad G \Vdash \neg\varphi.$$ □


The next corollary is useful in showing that sentences are forced.


8.7. COROLLARY. The following are equivalent:
(i) $p \Vdash^{w} \varphi$.
(ii) For every generic set G for T with $p \subseteq G$, $\mathfrak{M}(G) \models \varphi$.


We turn our attention to T-generic models. 


8.8. DEFINITION. A structure $\mathfrak{M}$ for L is said to be a T-generic model if $\mathfrak{M}$ is
isomorphic to the L-reduct of $\mathfrak{M}(G)$ for some generic set G for T.


8.9. THEOREM. If T is an inductive theory, then every T-generic model is a 
model of T. 


PROOF. We may assume T is a set of $\forall\exists$ sentences. Consider a sentence

$$\varphi = \forall x\, \exists y\, \psi(x, y) \in T$$

where $\psi$ is quantifier-free. We write $\varphi$ as

$$\neg\exists x\, \neg\exists y\, \psi(x, y).$$


Suppose $\varphi$ is not true in some $\mathfrak{M}(G)$ where G is generic for T. Then some
$p \subseteq G$ T-forces $\exists x\, \neg\exists y\, \psi(x, y)$, so


$$p \Vdash \neg\exists y\, \psi(c, y),$$


for some c. But p is consistent with T, so some $q \supseteq p$ must imply $\psi(c, d)$ for
some d in C. Then


$$q \Vdash^{w} \exists y\, \psi(c, y),$$


a contradiction. We conclude that $\varphi$ holds in every T-generic model. □


To get an idea of the nature of T-generic models we shall look at some 
examples and introduce the notion of an existentially closed model. 


8.10. EXAMPLE. Every generic linear ordering is isomorphic to the rationals.
Every generic ordered field is real closed. Every generic field is
algebraically closed. These examples can be verified directly using the 
definition of forcing. 


8.11. DEFINITION. A model $\mathfrak{M}$ of T is existentially closed in T if every
existential sentence $\varphi$ of $L_M$ which holds in some model of T extending $\mathfrak{M}$
holds in $\mathfrak{M}$.


It can be shown that the models described in Example 8.10 are 
existentially closed. The notion of an existentially closed model is a fruitful 
generalization of the notion of an algebraically closed field. 


8.12. Proposition. If T is inductive, every model of T can be extended to an 
existentially closed model. 


The proof of this result is completely elementary, using only the fact that 
T is closed under unions of chains. 


8.13. THEOREM. If T is an inductive theory, every T-generic model is 
existentially closed in T. 


PROOF. Let $\mathfrak{M}$ be T-generic, so $\mathfrak{M}$ is the L-reduct of $\mathfrak{M}(G)$ for some
generic G. Let $\exists x\, \varphi(x)$ hold in some model $\mathfrak{N} \models T$ extending $\mathfrak{M}$, where
$\varphi(x)$ is quantifier-free in $L_C$. Let $p \subseteq G$ and suppose the constant c does not
occur in p or $\varphi(x)$. Using the diagram of $\mathfrak{N}$ we can find a condition $q \supseteq p$
such that


$$T \cup q \vdash \varphi(c).$$

By 8.7 and 8.9,

$$q \Vdash^{w} \exists x\, \varphi(x).$$


Therefore p cannot force $\neg\exists x\, \varphi(x)$. It follows that $G \Vdash \exists x\, \varphi(x)$, so
$\exists x\, \varphi(x)$ holds in $\mathfrak{M}(G)$. □


The treatment of forcing is continued in Chapter A.4, but in a more 
general context. That chapter has several basic results for the present 
special case. For example, Theorem 8.13 has the following converse. 


8.14. THEOREM. Let T be an inductive theory. If the class of all existentially 
closed models of T is elementary, then every countable existentially closed 
model of T is T-generic. 


Thus every countable real closed ordered field is a generic ordered field, 
and every countable algebraically closed field is a generic field. The case of 
groups has been extensively studied. There are countable existentially 
closed groups which are not generic (Chapter A.4). 

Here is the eastern form of the omitting types theorem. 


8.15. BASIC OMITTING TYPES THEOREM (MACINTYRE [1972b]). Let T be an
inductive theory, and let $\varphi_{mn}(x)$, $m, n < \omega$, be existential formulas. Suppose
that for every condition p for T, each c in C, and each $m < \omega$, there is an
$n < \omega$ such that


$$T \cup p \cup \{\varphi_{mn}(c)\}$$


is consistent. Then T has a model $\mathfrak{M}$ which satisfies the infinite sentence
$\bigwedge_m \forall x \bigvee_n \varphi_{mn}(x)$.


PROOF. We may choose a sequence of conditions $p_0 \subseteq p_1 \subseteq \cdots$ such that
(1) $G = \bigcup_n p_n$ is generic.
(2) For each $m < \omega$ and each c in C, we have $p_k \Vdash \varphi_{mn}(c)$ for some
$k, n < \omega$.
Then the reduct of $\mathfrak{M}(G)$ has the desired properties. □


The proof of 8.15 shows that the model $\mathfrak{M}$ can be taken existentially
closed in T.


8.16. EXAMPLE. Let T be an inductive theory containing the group axioms.
Suppose that for each condition p for T and each $c \in C$ there is an n such
that $p \cup \{nc = 0\}$ is consistent with T. Then T has a model which is a
torsion group.
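
Example 8.16 can be read as an instance of Theorem 8.15. The sketch below spells out one way to choose the formulas; the indexing by n + 1 (to exclude the trivial equation 0x = 0) and the choice of formulas that do not depend on m are assumptions made here for illustration.

```latex
% Assumption: groups are written additively, and nx abbreviates x + ... + x (n times).
% Choice of formulas (quantifier-free, hence existential):
\begin{align*}
\varphi_{mn}(x) \;:\equiv\; (n+1)\,x = 0 \qquad (m, n < \omega).
\end{align*}
% Hypothesis of 8.15: for each condition p, each c \in C and each m there is n with
\begin{align*}
T \cup p \cup \{(n+1)\,c = 0\} \text{ consistent.}
\end{align*}
% Conclusion of 8.15: T has a model M with
\begin{align*}
\mathfrak{M} \models \bigwedge_{m} \forall x \bigvee_{n} (n+1)\,x = 0 ,
\end{align*}
% which says that every element of M has finite order, i.e. M is a torsion group.
```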


8.17. EXAMPLE. Let T be an inductive theory containing the ordered field
axioms. Suppose that for each condition p for T and each $c \in C$ there is an
n such that $p \cup \{c < n\}$ is consistent with T. Then T has an Archimedean
model.


8.18. THEOREM (MACINTYRE [1972b]). Suppose M is a group generated by
$\{a_1, \ldots, a_n\}$. If M is isomorphically embeddable in every existentially closed
group, then the set E of equations in $a_1, \ldots, a_n$ true in M is recursive.


PROOF. Suppose E is not recursive. Let $\Phi(x)$ be the set of basic sentences
not satisfied by $a_1, \ldots, a_n$ in M. Then $\Phi(x)$ is not recursive. Since $\varphi \in \Phi$ iff
$(\neg\varphi) \notin \Phi$, $\Phi(x)$ is not even recursively enumerable. We shall find an
existentially closed group N such that


$$N \models \forall x \bigvee_{\varphi \in \Phi} \varphi(x).$$


M cannot be isomorphically embedded in such an N. Let p be a condition
for group theory T and let $c_1, \ldots, c_n \in C$. We claim that there is a $\varphi \in \Phi$
such that $p \cup \{\varphi(c)\}$ is a condition. For otherwise


$$\Phi(x) = \{\varphi(x) : T \cup p \vdash \neg\varphi(c)\}$$


and thus $\Phi(x)$ would be recursively enumerable. It follows by the basic
omitting types theorem that the desired group N exists. □


The T-generic models as defined here are necessarily countable. BARWISE
and ROBINSON [1970] gave a more general definition which also allows
uncountable T-generic models. We have presented Robinson’s finite 
forcing. There is a parallel theory of infinite Robinson forcing, which is 
sketched briefly in Chapter A.4. 


9. Infinite formulas and extra quantifiers


The restriction to first order logic is a severe limitation if we want to 
apply model theory to other parts of mathematics. Chapter A.1 gives 
several examples of concepts which are more naturally treated in stronger 
languages. The theorem of Lindström (in Chapter A.1) shows that first
order logic is the only logic for which the compactness and
Löwenheim-Skolem theorems hold. Nevertheless, there are two highly
successful extensions of model theory to stronger logics. In model theory 
for infinitary logic we get by without the compactness theorem. In model 
theory with extra quantifiers we keep the compactness theorem but change 
the notion of a structure. We shall briefly discuss some of the methods 
which are available in these logics. 


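9.1. DEFINITION. Let $\kappa$ be a regular cardinal. The logic $L_{\kappa\omega}$ is the set of
formulas built up according to the following rules:
(i) Every atomic formula of L, with variables $x_\alpha$, $\alpha < \kappa$, is in $L_{\kappa\omega}$.
(ii) If $\varphi$, $\psi$ are in $L_{\kappa\omega}$, so are $\neg\varphi$, $\varphi \vee \psi$, and $\exists x\,\varphi$.
(iii) If $\Phi$ is a set of formulas of $L_{\kappa\omega}$ of cardinality less than $\kappa$, then $\bigvee\Phi$ is
in $L_{\kappa\omega}$.


We write $L_{\infty\omega} = \bigcup_{\kappa} L_{\kappa\omega}$, and for $\lambda$ singular, $L_{\lambda\omega} = \bigcup_{\kappa<\lambda} L_{\kappa\omega}$.


9.2. EXAMPLE. $L_{\omega\omega}$ is ordinary first order logic. $L_{\omega_1\omega}$ is like first order logic
but allows countable disjunctions.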


The finite symbols $\wedge$, $\to$, $\leftrightarrow$, and $\forall$, and the infinite conjunction $\bigwedge$, may
be introduced as abbreviations.


9.3. LEMMA. Each formula of $L_{\kappa\omega}$ has fewer than $\kappa$ subformulas.


The notion of a subformula is defined in the natural way. (The separate
definition of $L_{\kappa\omega}$ for singular $\kappa$ is needed to make this lemma true.) The $\omega$
in $L_{\kappa\omega}$ indicates that finite quantifiers are allowed. More general languages
$L_{\kappa\lambda}$ allow quantifiers over sets of fewer than $\lambda$ variables.

In our discussion we shall concentrate on the two most important cases,
$L_{\infty\omega}$ and $L_{\omega_1\omega}$.

In $L_{\infty\omega}$ a structure is completely determined up to isomorphism by its
diagram.
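

9.4. EXAMPLE. Given a structure $\mathfrak{M}$, we have $\mathfrak{N} \cong \mathfrak{M}$ if and only if $\mathfrak{N}$
satisfies the $L_{\infty\omega}$ sentence


$$\bigwedge D(\mathfrak{M}) \wedge \forall x \bigvee_{m \in M} x = m.$$


$L_{\infty\omega}$ becomes more interesting when we work in the original language of
$\mathfrak{M}$ without adding symbols for the elements of $\mathfrak{M}$.


9.5. EXAMPLE. The notion of an $\omega$-saturated model is expressible by the
following sentence of $L_{\infty\omega}$, where $y = y_1, \ldots, y_n$:


$$\bigwedge_{n<\omega}\ \bigwedge_{\Phi(x, y) \subseteq L}\ \forall y \Big[ \bigwedge_{\varphi_1, \ldots, \varphi_k \in \Phi} \exists x\,(\varphi_1 \wedge \cdots \wedge \varphi_k) \to \exists x \bigwedge \Phi \Big].$$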


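9.6. EXAMPLE. Every well-ordering $\langle \alpha, < \rangle$ can be characterized up to
isomorphism by a single sentence $\varphi_\alpha$ in $L_{\infty\omega}$ (with only the relation symbol
<).


We first define, by induction on $\alpha$, formulas $\psi_\alpha(x)$ stating that the set of
predecessors of x has order type $\alpha$:


$$\psi_0(x) = \neg\exists y\; y < x,$$

$$\psi_\alpha(x) = \bigwedge_{\beta<\alpha} \exists y\,(y < x \wedge \psi_\beta(y)) \wedge \forall y\,\Big(y < x \to \bigvee_{\beta<\alpha} \psi_\beta(y)\Big).$$

Then we define $\varphi_\alpha$ by

$$\varphi_\alpha = (\text{linear order}) \wedge \bigwedge_{\beta<\alpha} \exists x\, \psi_\beta(x) \wedge \forall x \bigvee_{\beta<\alpha} \psi_\beta(x).$$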
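
The first two steps of the induction in Example 9.6 can be written out explicitly. The following is a sketch that reads the definition as: $\psi_\alpha(x)$ says that every $\psi_\beta$ with $\beta < \alpha$ is realized below x and that everything below x satisfies some such $\psi_\beta$; the unfolding for $\alpha = 1$ and $\alpha = 2$ is an illustration added here.

```latex
% Unfolding the definition for small ordinals.
\begin{align*}
\psi_0(x) &= \neg\exists y\; y < x
   && \text{(} x \text{ has no predecessors)},\\
\psi_1(x) &= \exists y\,(y < x \wedge \psi_0(y)) \wedge \forall y\,(y < x \to \psi_0(y))
   && \text{(in a linear order: } x \text{ has exactly one predecessor)},\\
\varphi_2 &= (\text{linear order}) \wedge \exists x\,\psi_0(x) \wedge \exists x\,\psi_1(x)
             \wedge \forall x\,(\psi_0(x) \vee \psi_1(x)).
\end{align*}
% A linear order satisfies \varphi_2 exactly when it has a least element, an element
% whose only predecessor is the least one, and nothing else: the two-element chain.
```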


The notion of a well-ordering can be defined by the following sentence of
$L_{\omega_1\omega_1}$:


$$(\text{linear order}) \wedge \neg(\exists x_0 x_1 x_2 \cdots) \bigwedge_{n<\omega} x_{n+1} < x_n.$$


However, it cannot be defined by a sentence of $L_{\infty\omega}$.


9.7. THEOREM (LOPEZ-ESCOBAR [1966]). Every sentence of $L_{\infty\omega}$ which has
arbitrarily large well-ordered models has a nonwell-ordered model.


The back and forth construction is useful in $L_{\infty\omega}$ and provides a
characterization of $L_{\infty\omega}$ equivalence.


9.8. DEFINITION. $\mathfrak{M}$ and $\mathfrak{N}$ are $L_{\infty\omega}$-equivalent, in symbols $\mathfrak{M} \equiv_{\infty\omega} \mathfrak{N}$, if they
satisfy the same sentences of $L_{\infty\omega}$.


9.9. DEFINITION. A partial isomorphism $I : \mathfrak{M} \cong_p \mathfrak{N}$ is a relation I on the set
of finite sequences (m, n) of elements of $M \times N$ such that:
(i) $\emptyset \mathrel{I} \emptyset$.
(ii) If $m \mathrel{I} n$, then $(\mathfrak{M}, m)$ and $(\mathfrak{N}, n)$ satisfy the same atomic sentences.
(iii) If $m \mathrel{I} n$ then for all $a \in M$ there exists $b \in N$ such that
$(m, a) \mathrel{I} (n, b)$, and vice versa.


9.10. THEOREM (KARP [1964]). $\mathfrak{M} \equiv_{\infty\omega} \mathfrak{N}$ if and only if $\mathfrak{M}$ and $\mathfrak{N}$ are
partially isomorphic.
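
A standard illustration of Definition 9.9 and Karp's theorem is the pair of orderings of the rationals and the reals. The sketch below is added here as an example; the relation I is the usual "same order type" relation on finite tuples and is not taken from the text.

```latex
% Assumption: M = (\mathbb{Q}, <), N = (\mathbb{R}, <).
% Define I on pairs of finite sequences of the same length k by
\begin{align*}
(m_1, \ldots, m_k) \mathrel{I} (n_1, \ldots, n_k)
  \quad \text{iff} \quad
  \text{for all } i, j:\ (m_i < m_j \Leftrightarrow n_i < n_j)
  \text{ and } (m_i = m_j \Leftrightarrow n_i = n_j).
\end{align*}
% (i)   The empty sequences are related.
% (ii)  Atomic sentences only compare named elements by < and =, so related
%       tuples satisfy the same atomic sentences.
% (iii) Given a \in \mathbb{Q}, its position relative to m_1, ..., m_k is either
%       equal to some m_i, below all, above all, or strictly between two
%       neighbours; since \mathbb{R} is dense without endpoints there is
%       b \in \mathbb{R} in the same position relative to n_1, ..., n_k.
%       The other direction uses that \mathbb{Q} is dense without endpoints.
% Hence, by Theorem 9.10,
\begin{align*}
(\mathbb{Q}, <) \equiv_{\infty\omega} (\mathbb{R}, <),
\end{align*}
% although the two structures are not isomorphic, having different cardinalities.
```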


The following result is analogous to Lindström's Theorem (see Chapter
A.1), and characterizes the logic $L_{\infty\omega}$.


9.11. THEOREM (BARWISE [1974]). $L_{\infty\omega}$ is the only logic closed under
$\bigwedge$, $\neg$, $\exists$ which satisfies the nonwell-ordering Theorem 9.7 and Karp's
Theorem 9.10.


The following examples show that in general $L_{\infty\omega}$ equivalence does not
imply isomorphism.


9.12. EXAMPLE. By Theorem 9.10 the following are $L_{\infty\omega}$ equivalent, where
X and Y are infinite sets.
$(\mathcal{P}(X), \subseteq)$ and $(\mathcal{P}(Y), \subseteq)$ where $\mathcal{P}(X)$ is the set of all subsets of X.
G(X) and G(Y) where G(X) is the group freely generated by X.
F[X] and F[Y] where F is a field and F[X] is the ring of polynomials
over F in X.
Any two models of a complete $\omega$-categorical theory in L.
Any two $\omega$-saturated models of a complete theory in L.


BARWISE and EKLOF [1970] generalized the Ulm invariants to uncountable
abelian groups and showed that two reduced torsion abelian groups
have the same Ulm invariants if and only if they are $L_{\infty\omega}$ equivalent. For
additional applications of partial isomorphisms see BARWISE [1973].

One trouble with the language $L_{\infty\omega}$ is that the formulas form a proper
class rather than a set. Often one works with a set of formulas closed under
subformulas. When we do this, the Downward Löwenheim-Skolem
Theorem and the elementary chain theorem generalize to $L_{\infty\omega}$.

One more construction which is available in $L_{\infty\omega}$ is the notion of an
Ehrenfeucht-Mostowski model, which is discussed in Chapter A.5.

The model theory for $L_{\omega_1\omega}$ is much more successful than that for $L_{\infty\omega}$. The
key method is model-theoretic forcing, which we shall extend to $L_{\omega_1\omega}$. The
method is equivalent to an earlier construction of Makkai called a
consistency property (see KEISLER [1971]).


9.13. EXAMPLE. The notions of an Archimedean ordered field and of a
torsion group are expressible by sentences in $L_{\omega_1\omega}$. So are the notions of a
recursively saturated model, a prime model, and an $\omega$-homogeneous
model. In Section 6 we already used a sentence of $L_{\omega_1\omega}$ to state the omitting
types theorem. The sentence $\forall x \bigvee_{n<\omega} x = n$ holds in $\mathfrak{M}$ if and only if $\mathfrak{M}$ is
an $\omega$-model. From these examples, first order model theory leads quite
naturally to $L_{\omega_1\omega}$.


Assume hereafter that L has countably many symbols. 
Theorem 9.10 of Karp has the following sharper form for countable 
models. 


9.14. THEOREM (SCOTT [1965]). For every countable structure $\mathfrak{M}$ there is a
sentence $\varphi$ of $L_{\omega_1\omega}$ such that for all countable $\mathfrak{N}$,


$$\mathfrak{N} \cong \mathfrak{M} \quad \text{iff} \quad \mathfrak{N} \models \varphi.$$


Let C be a countably infinite set of new constant symbols. We shall
generalize model-theoretic forcing by replacing the set of sentences of $L_C$
by a well behaved set of sentences of $(L_C)_{\omega_1\omega}$.


9.15. DEFINITION. By a forcing base we mean a countable set S of $(L_C)_{\omega_1\omega}$
sentences such that:
(i) Each atomic sentence of $L_C$ belongs to S.
(ii) Each $\varphi \in S$ contains only finitely many $c \in C$.
(iii) If $\psi(x)$ is a subformula of some $\varphi \in S$ and c is in C, then $\psi(c) \in S$.
(iv) If $\varphi \in S$ and $\varphi$ does not begin with $\neg$, then $(\neg\varphi) \in S$.


9.16. DEFINITION. A forcing property on S is a nonempty set P of consistent
finite subsets $p \subseteq S$ such that
(i) $p \in P$ and $q \subseteq p$ implies $q \in P$.
(ii) If $p \vdash \exists x\,\varphi$ and $\varphi(c) \in S$, then $p \cup \{\varphi(d)\} \in P$ for some $d \in C$.
(iii) If $\bigvee\Phi \in p$, then $p \cup \{\varphi\} \in P$ for some $\varphi \in \Phi$.


Using the elements $p \in P$ as conditions, the treatment of forcing goes
exactly as before. The definition of forcing has a new clause for infinite
disjunction,

$$p \Vdash \bigvee\Phi \quad \text{iff} \quad p \Vdash \varphi \text{ for some } \varphi \in \Phi.$$


Robinson forcing as developed in Section 8 is the special case where the
conditions $p \in P$ are finite sets of atomic and negated atomic sentences
consistent with T. In the present more general setting the conditions may
contain sentences with connectives and quantifiers. The definition of a
forcing property ensures that if $\varphi \in p$ then p weakly forces $\varphi$. By a generic
set we mean a set $G \subseteq S$ such that each finite $p \subseteq G$ is a condition in P, and
for each $\varphi \in S$ we have either $G \Vdash \varphi$ or $G \Vdash \neg\varphi$. Using the countability of
S we see that every $p \in P$ can be extended to a generic set G.


9.17. GENERIC MODEL THEOREM. For every generic set G there is a model
$\mathfrak{M}(G)$ such that for all $\varphi \in S$,


$$\mathfrak{M}(G) \models \varphi \quad \text{iff} \quad G \Vdash \varphi.$$


The Generic Model Theorem is as useful in $L_{\omega_1\omega}$ as the compactness
theorem is in first order logic. For example, the preservation theorems for
substructures (Malitz), and homomorphic images (Lopez-Escobar), the
Craig interpolation theorem (Lopez-Escobar), the two cardinal theorem
from $(\kappa, \lambda)$ to $(\omega_1, \omega)$ (Keisler), the omitting types theorem, and the
existence theorem for prime models can all be extended to $L_{\omega_1\omega}$ using the
forcing construction.

The Generic Model Theorem can also be used to prove the following
completeness theorem for $L_{\omega_1\omega}$.


9.18. Axioms for $L_{\omega_1\omega}$.
All axiom schemes for L.
$\bigwedge\Phi \to \varphi$ for each $\varphi \in \Phi$.


9.19. Rules for $L_{\omega_1\omega}$.
All rules of inference for L.
From $\{\psi \to \varphi : \varphi \in \Phi\}$ infer $\psi \to \bigwedge\Phi$.


We allow proofs of length less than $\omega_1$.


9.20. COMPLETENESS THEOREM FOR $L_{\omega_1\omega}$ (KARP [1964]). A sentence $\varphi$ of $L_{\omega_1\omega}$
is provable if and only if it is valid.


PROOF. If $\varphi$ is provable it is obviously valid. Suppose $\varphi$ is not provable.
Let S be a forcing base with $\varphi \in S$, and let P be the set of all finite $p \subseteq S$
such that p is formally consistent. From the axioms it follows that P is a
forcing property. Since $\{\neg\varphi\}$ is a condition, $\neg\varphi$ has a generic model $\mathfrak{M}$,
hence $\varphi$ is not valid. □


There is also an Upward Löwenheim-Skolem Theorem for $L_{\omega_1\omega}$. It is
stated and proved for the special case of $\omega$-logic in Chapter A.5.

One can do still more infinitary model theory by choosing the language
more carefully using the notion of an admissible set (see Chapter A.7). The
restriction of $L_{\infty\omega}$ to an admissible set A is denoted by $L_A$. When working
with $L_A$ one can bring in methods from generalized recursion theory. The
most important result is the Barwise compactness theorem. For more
about $L_{\omega_1\omega}$ and $L_A$ see KEISLER [1971] and BARWISE [1975].

We conclude with a few words about extra quantifiers. Formally, the
language L(Q) is like first order logic but has an extra quantifier Q and the
rule:


If $\varphi$ is a formula, so is $Qx\,\varphi$.


These quantifiers were first studied by Mostowski. A structure for L(Q) has
the form $(\mathfrak{M}, q)$ where $\mathfrak{M}$ is a structure for L and q is a set of subsets of M.
The definition of satisfaction has the extra clause


$$(\mathfrak{M}, q) \models Qx\,\varphi \quad \text{iff} \quad \{m \in M : (\mathfrak{M}, q) \models \varphi(m)\} \in q.$$


When we consider arbitrary structures, the model theory for L(Q) is much 
like the model theory for L with few new difficulties. However, in the 
interesting applications of L(Q) we restrict the class of models to reflect an 
intended interpretation of Q, and new problems arise. 

One natural interpretation of Q is "there exist infinitely many". For this
interpretation, we consider only structures $(\mathfrak{M}, q)$ where q is the set of
infinite subsets of M. This language is similar to $\omega$-logic and is equivalent
to a sublanguage of $L_{\omega_1\omega}$.
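
As a small hedged illustration of the satisfaction clause under the "infinitely many" interpretation, consider the natural numbers with their usual order. The structure and the parameter 5 are choices made here for the example and do not come from the text.

```latex
% Assumption: M = (\mathbb{N}, <), and q = the set of infinite subsets of \mathbb{N}.
\begin{align*}
(\mathfrak{M}, q) &\models Qx\,(x = x)
   && \text{since } \{m : m = m\} = \mathbb{N} \in q,\\
(\mathfrak{M}, q) &\models \neg Qx\,(x < 5)
   && \text{since } \{m : m < 5\} = \{0, 1, 2, 3, 4\} \notin q,\\
(\mathfrak{M}, q) &\models \forall y\, \neg Qx\,(x < y)
   && \text{since every proper initial segment of } \mathbb{N} \text{ is finite}.
\end{align*}
```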

A second interpretation of Q is "there exist uncountably many". We call
this language $L(Q_{\omega_1})$. It is not part of $L_{\omega_1\omega}$ but has a very well behaved
model theory. VAUGHT [1965] and FUHRKEN [1965] showed that the
compactness theorem holds for countable theories, and the set of valid
sentences is recursively enumerable. In KEISLER [1970] there is a completeness
theorem with the simple set of axioms listed in Chapter A.1. The proof
uses a form of the omitting types theorem, which also generalizes to
$L(Q_{\omega_1})$. BRUCE [1975] has extended Robinson forcing to $L(Q_{\omega_1})$. It turns
out that the combined language $L_{\omega_1\omega}(Q_{\omega_1})$ also has a well behaved model
theory.

Another successful interpretation of $Q$ is aimed at studying topological
models. In the logic $L(Q_{\mathrm{open}})$ one considers models $(\mathfrak{M}, q)$ where $q$ is the set
of open sets of a topology on $M$. Thus $Qx\,\varphi(x)$ means “the set of $x$ such
that $\varphi(x)$ is open”. SGRO [1976a] has proved the compactness theorem for
$L(Q_{\mathrm{open}})$ using ultraproducts, and the completeness theorem with the
following axioms:


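(1) Axioms for $L$.

(2) $\forall x\,(\varphi \leftrightarrow \psi) \to (Qx\,\varphi \leftrightarrow Qx\,\psi)$.

(3) $Qx\,\varphi(x) \leftrightarrow Qy\,\varphi(y)$.

(4) $Qx\; x = x$.

(5) $Qx\; x \neq x$.

(6) $Qx\,\varphi \wedge Qx\,\psi \to Qx\,(\varphi \wedge \psi)$.

(7) $\forall y\, Qx\,\varphi(x, y) \to Qx\, \exists y\,\varphi(x, y)$.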


The last axiom says that “definable unions of open sets are open”. SGRO
[1976b] and GARAVAGLIA [1975] have gone on to study stronger logics (the
next step is quantifiers on two variables), and have applied $L(Q_{\mathrm{open}})$ to
topological groups.

Introduction


The ultraproduct construction is an algebraic operation whose impor- 
tance derives from its model-theoretic properties. The algebraic character 
of the construction makes it a very attractive (though not essential) tool to 
employ in giving an account of applications of model theory to algebra. 

This paper is a survey of the basic properties of ultraproducts and of 
some of their applications to algebra. The prerequisites for reading the 
paper are familiarity with the fundamental definitions of model theory
given in the previous chapters, and with the definitions of “category”,
“functor”, and “natural transformation”. Moreover, since the applications
to algebra are taken from such diverse areas as group theory, ring
theory, algebraic geometry, universal algebra and algebraic number 
theory, we have found it necessary to assume an acquaintance with the 
algebraic notions and results used in the examples. (Usually we have given 
references for definitions and results not found in LANG [1971].) However, 
the reader may skip over a particular algebraic example without any loss of 
understanding of later material (except possibly of related algebraic 
examples). 

The paper is divided into three parts. The first part, Basics, gives the 
definitions of ultrafilters and ultraproducts (as well as of a generalization of 
the ultraproduct called the reduced product) and proves the “fundamental
theorem of ultraproducts”. In the last section of the first part we discuss the
functorial properties of ultraproducts. The rest of the results are given in 
two parts, Compactness and Saturation, whose titles refer to the key 
model-theoretic properties of ultraproducts used in proving these results. 

Some of the deepest applications of ultraproducts to algebra are only 
mentioned here, but references are given. Also, because the emphasis in 
this paper is on algebraic applications of ultraproducts, we have not 
discussed at all many interesting results about ultraproducts which do not 
fall under this heading. For more about ultraproducts we refer the reader 
to the excellent survey articles by KEISLER [1965] and CHANG [1967] and to
the textbooks CHANG-KEISLER [1973] and BELL and SLOMSON [1969].


BASICS 


1. Filters 


If $I$ is a non-empty set, a filter over $I$ is a set $D$ of subsets of $I$ such that:
(i) $\emptyset \notin D$ and $I \in D$;
(ii) if $X, Y \in D$, then $X \cap Y \in D$;

(iii) if $X \in D$ and $X \subseteq Y \subseteq I$, then $Y \in D$.

For example, if $Y$ is a non-empty subset of $I$, the set $\{X \subseteq I \mid Y \subseteq X\}$ is a
filter over $I$, called the principal filter generated by $Y$; we denote it by $(Y)$.
If $I$ is finite, every filter over $I$ is principal (generated by $\bigcap \{X \mid X \in D\}$).
If $I$ is infinite, an example of a non-principal filter over $I$ is the cofinite filter
$C = \{X \subseteq I \mid I - X \text{ is finite}\}$.

Notice that by (i) and (ii), a filter $D$ over $I$ has the finite intersection
property (FIP), i.e., the intersection of any finite set of elements of $D$ is
non-empty. Obviously any subset of $D$ also has the FIP. Conversely, if $S$ is a
set of subsets of $I$ which has the FIP, then $S$ is a subset of a filter over $I$; in
fact


$D = \{Y \subseteq I \mid X_1 \cap \cdots \cap X_n \subseteq Y \text{ for some } X_1, \ldots, X_n \in S\}$


is a filter containing $S$.
A filter $D$ over $I$ is called an ultrafilter over $I$ if for every $X \subseteq I$, either
$X \in D$ or $(I - X) \in D$.
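
The definitions above can be checked mechanically when the index set is finite. The following Python sketch is only an illustration and not part of the original text: the function names are hypothetical, and it relies on the fact that on a finite set the filter generated by a family with the FIP is the principal filter generated by the intersection of all its members.

```python
from itertools import combinations


def powerset(universe):
    """All subsets of `universe`, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_filter(family, universe):
    """Check conditions (i)-(iii) of the definition on a finite index set."""
    family = set(family)
    universe = frozenset(universe)
    if frozenset() in family or universe not in family:
        return False                                   # condition (i)
    for x in family:
        for y in family:
            if x & y not in family:
                return False                           # condition (ii)
    for x in family:
        for y in powerset(universe):
            if x <= y and y not in family:
                return False                           # condition (iii)
    return True


def generated_filter(s, universe):
    """Filter generated by a family `s` of subsets of a finite `universe`.

    Returns None when the members of `s` have empty intersection,
    i.e. when `s` fails the FIP.
    """
    core = frozenset(universe)
    for x in s:
        core &= x
    if not core:
        return None
    return {y for y in powerset(universe) if core <= y}


def is_ultrafilter(family, universe):
    """A filter that contains X or I - X for every subset X."""
    universe = frozenset(universe)
    return is_filter(family, universe) and all(
        x in family or (universe - x) in family for x in powerset(universe))
```

For instance, over $I = \{0, 1, 2, 3\}$ the family $\{\{0,1,2\}, \{1,2,3\}\}$ generates the principal filter $(\{1, 2\})$, which is a filter but not an ultrafilter, while $(\{2\})$ is an ultrafilter.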


1.1. Proposition. A filter over $I$ is an ultrafilter if and only if it is a maximal
filter over $I$.


Proof. Suppose $D$ is an ultrafilter over $I$ and suppose $E$ is a filter over $I$
such that $D \subseteq E$ and $D \neq E$. Then $X \in E - D$ for some $X \subseteq I$. Since $X \notin D$
we have $I - X \in D$ by definition of an ultrafilter. But then $I - X \in E$,
which is impossible since $\{X, I - X\} \subseteq E$ does not have the FIP. Conversely,
suppose $D$ is a maximal filter and let $X \subseteq I$ be such that $X \notin D$. Then
$D \cup \{I - X\}$ has the FIP, so it is a subset of a filter $E$. By the maximality of $D$, $E$
must equal $D$ and so $I - X \in D$. □


An application of Zorn’s Lemma yields the following. 


1.2. Corollary. If $S$ is a set of subsets of $I$ which has the finite intersection
property, then $S$ is contained in an ultrafilter over $I$. □


1.3. Proposition. If $D$ is an ultrafilter over $I$, $X \in D$, and
$X = Y_1 \cup \cdots \cup Y_n$, then for some $i$, $Y_i \in D$.


Proof. If not, then by definition of ultrafilter, $I - Y_i$ is a member of $D$ for
all $i = 1, \ldots, n$. But then by (ii) of the definition of a filter,
$\emptyset = X \cap (I - Y_1) \cap \cdots \cap (I - Y_n)$ is a member of $D$, which contradicts (i). □

A principal ultrafilter is an ultrafilter which is a principal filter. If $D$ is a
filter over $I$ and $\{x\} \in D$ for some $x \in I$, then $D$ is a principal ultrafilter
over $I$ generated by $\{x\}$; in fact, $Y \in D$ iff $Y \cap \{x\} \in D$ iff $\{x\} \subseteq Y$.
Conversely, if $D$ is a principal ultrafilter, then $D$ must be generated by a
singleton; for, otherwise, if $D$ is generated by $Y$ and there exists
$\emptyset \neq Y' \subsetneq Y$, then the filter generated by $Y'$ is a proper extension of $D$,
contradicting 1.1. □


1.4. Proposition. For any ultrafilter $D$ over $I$, $D$ is non-principal if and only
if $D$ contains $C$, the cofinite filter.


Proof. If $D$ is a non-principal ultrafilter over $I$ and $X = \{x_1, \ldots, x_n\}$ is a
finite subset of $I$, then by the above remarks $\{x_i\} \notin D$ for all $i = 1, \ldots, n$.
Hence $I - \{x_i\} \in D$ for all $i = 1, \ldots, n$, so $\bigcap_{i=1}^{n} (I - \{x_i\}) = (I - X) \in D$.
Therefore $C \subseteq D$. Conversely, if $C \subseteq D$, then $\{x\} \notin D$ for all $x \in I$ (since
$I - \{x\} \in C$), so by the remarks preceding the proposition, $D$ is
non-principal. □


1.5. Corollary. Let $S$ be a set of subsets of $I$ such that $X_1 \cap \cdots \cap X_n$ is
infinite for all $X_1, \ldots, X_n \in S$. Then $S$ is contained in a non-principal
ultrafilter over $I$.


Proof. It follows from the hypothesis on $S$ that $C \cup S$ has the FIP, so we
can apply 1.2 and 1.4. □


2. Reduced products 


Let $L$ be a first-order language; that is, $L$ is a collection of relation,
function, and constant symbols. We refer to 3.1 of Chapter A.1 for the
definition of a model (or structure) $\mathfrak{A}$ for $L$. (When we talk about specific
algebraic systems we will sometimes follow algebraic custom and denote
the model $\mathfrak{A}$ by its universe $A$.) If $\Sigma$ is a consistent set of sentences of $L$, let
$\mathcal{M}(\Sigma)$ denote the class of all models of $\Sigma$, i.e. the class of all models $\mathfrak{A}$ for $L$
such that every sentence of $\Sigma$ is true in $\mathfrak{A}$, denoted $\mathfrak{A} \models \Sigma$. By an abuse of
notation we shall write $\mathcal{M}(L)$ instead of $\mathcal{M}(\emptyset)$; thus $\mathcal{M}(L)$ is the class of all
models for $L$.

If $\mathfrak{A}, \mathfrak{B} \in \mathcal{M}(L)$, a homomorphism $\eta$ from $\mathfrak{A}$ to $\mathfrak{B}$ is a set function
$\eta : A \to B$ which preserves all the relations, functions, and constants of $L$.

For example, if $L = \{+, \le, 0\}$ (where $+$ is a binary function symbol, $\le$ is a
binary relation symbol and $0$ is a constant symbol) then $\eta : \mathfrak{A} \to \mathfrak{B}$ is a
homomorphism if and only if $\eta(0_{\mathfrak{A}}) = 0_{\mathfrak{B}}$ and for all $a_1, a_2 \in A$


$\eta(a_1 + a_2) = \eta(a_1) + \eta(a_2)$;


$a_1 \le a_2 \Rightarrow \eta(a_1) \le \eta(a_2)$.


As usual, an isomorphism is a homomorphism which has an inverse 
homomorphism, and an embedding is a homomorphism which is an 
isomorphism onto its range. 

If $(\mathfrak{A}_i : i \in I)$ is a family of elements of $\mathcal{M}(L)$ indexed by a set $I$, denote
by $\prod_I \mathfrak{A}_i$ the direct product of the family; that is, $\prod_I \mathfrak{A}_i$ is the model for $L$
whose universe is the direct product $\prod_I A_i$ of the sets $A_i$ and whose
relations, functions and constants are defined “componentwise”. For
example, if $\mathfrak{A}_i = (A_i, +_i, \le_i, 0_i)$, then $\prod_I \mathfrak{A}_i = (\prod_I A_i, +, \le, 0)$ where
$0(i) = 0_i$ for all $i \in I$, and for any $f, g \in \prod_I A_i$, $(f + g)(i) = f(i) +_i g(i)$; and $f \le g$ iff
$f(i) \le_i g(i)$ for all $i \in I$.

If $(\mathfrak{A}_i : i \in I)$ is a family indexed by $I$ and $X$ is a subset of $I$ we shall
understand by $\prod_X \mathfrak{A}_i$ the direct product of the family $(\mathfrak{A}_i : i \in X)$. Notice
that if $Y \subseteq X \subseteq I$, the canonical projection $\pi_{XY} : \prod_X \mathfrak{A}_i \to \prod_Y \mathfrak{A}_i$ is a
homomorphism.

Let $D$ be a filter over a set $I$. Observe that if we define a partial ordering
on $D$ by $X \le Y \iff X \supseteq Y$, then $(D, \le)$ is a directed set (i.e. for any
$X, Y \in D$ there exists $Z \in D$ such that $X \le Z$ and $Y \le Z$). If $(\mathfrak{A}_i : i \in I)$ is
a family of models for $L$ indexed by $I$, the reduced product of $(\mathfrak{A}_i : i \in I)$
modulo $D$, denoted $\prod_D \mathfrak{A}_i$, is defined to be the direct limit (in $\mathcal{M}(L)$) of the
directed system $\{\pi_{XY} : \prod_X \mathfrak{A}_i \to \prod_Y \mathfrak{A}_i \mid X, Y \in D,\ X \le Y\}$. If $D$ is an
ultrafilter, $\prod_D \mathfrak{A}_i$ is called the ultraproduct of $(\mathfrak{A}_i : i \in I)$ modulo $D$. If $\mathfrak{A}_i = \mathfrak{A}$
for all $i \in I$, we write $\prod_D \mathfrak{A}$ instead of $\prod_D \mathfrak{A}_i$ and call it the ultrapower of $\mathfrak{A}$
modulo $D$.

Although the shortest approach to the definition of reduced products is 
via the notion of direct limit, this approach is perhaps misleading since it is 
the concrete construction of the direct limit rather than its universal 
mapping properties which will be of importance in the sequel. So let us 
describe the structure of $\prod_D \mathfrak{A}_i$ more explicitly. The following description of
the reduced product follows easily from the usual construction of direct
limit, given the observation that for every $X \in D$ the projection $\pi_{IX}$ is
surjective, so that every element of the direct limit is represented by an
element of $\prod_I A_i$. Those readers who are not comfortable with the notion of
direct limits may take the following as the definition of reduced products.

Define an equivalence relation $\equiv_D$ on $\prod_I A_i$ as follows: if $f, g \in \prod_I A_i$,
then $f \equiv_D g$ if and only if $\pi_{IX}(f) = \pi_{IX}(g)$ for some $X \in D$, i.e. $f \equiv_D g$ if and
only if $\{i \in I \mid f(i) = g(i)\}$ is an element of $D$. The universe of $\prod_D \mathfrak{A}_i$ is the
set $\prod_D A_i$ of equivalence classes of elements of $\prod_I A_i$. Denote the equivalence
class of $f \in \prod_I A_i$ by $f_D$.

The relations on $\prod_D \mathfrak{A}_i$ are defined by the condition that a relation holds
between elements of $\prod_D \mathfrak{A}_i$ if and only if it holds on a set of components
which belongs to $D$; similarly for functions. For example, if
$\mathfrak{A}_i = (A_i, +_i, \le_i, 0_i)$ and $f_D, g_D, h_D \in \prod_D A_i$, then $f_D \le g_D$ in $\prod_D \mathfrak{A}_i$ if and only if
$\{i \in I \mid f(i) \le_i g(i)\} \in D$. (Equivalently, $f_D \le g_D$ if and only if there exists
$X \in D$ such that $\pi_{IX}(f) \le \pi_{IX}(g)$ in $\prod_X \mathfrak{A}_i$.) Similarly, $f_D + g_D = h_D$ in $\prod_D \mathfrak{A}_i$
if and only if $\{i \in I \mid f(i) +_i g(i) = h(i)\} \in D$. And the interpretation of $0$ in
$\prod_D \mathfrak{A}_i$ is $0_D$ where $\{i \in I \mid 0(i) = 0_i\} \in D$. It is easy to verify, using the
properties of filters, that $\le$, $+$, and $0$ on $\prod_D \mathfrak{A}_i$ are well-defined.

The following lemma justifies the similarity of our notation for direct 
products and reduced products. It is an immediate consequence of the fact 
that the directed set ((X), <) has X as a largest element. 


2.1. Lemma. If $(X)$ is the principal filter over $I$ generated by $X$, then the
reduced product of $(\mathfrak{A}_i : i \in I)$ modulo $(X)$ is isomorphic to the direct product
of $(\mathfrak{A}_i : i \in X)$ (i.e. $\prod_{(X)} \mathfrak{A}_i \cong \prod_X \mathfrak{A}_i$).
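
For finite universes the equivalence relation $\equiv_D$ can be computed directly, which gives a quick sanity check of Lemma 2.1 at the level of universes. The Python sketch below is an illustration only: the function name is hypothetical, only the underlying sets are handled (no relations or functions), and the filter is passed explicitly as a collection of index sets.

```python
from itertools import product


def reduced_product_classes(universes, filter_sets):
    """Partition the direct product of finite sets into classes of =_D.

    `universes` is a list A_0, ..., A_{k-1}; `filter_sets` is the filter D,
    given as a collection of frozensets of indices. Two tuples f, g are
    equivalent when {i : f(i) = g(i)} belongs to D. Comparing against one
    representative per class is enough because =_D is an equivalence
    relation whenever D is a filter.
    """
    filter_sets = set(filter_sets)
    classes = []
    for f in product(*universes):
        for cls in classes:
            g = cls[0]
            agree = frozenset(i for i in range(len(f)) if f[i] == g[i])
            if agree in filter_sets:
                cls.append(f)
                break
        else:
            classes.append([f])
    return classes
```

With universes of sizes 2, 3, 2 and the principal filter generated by $X = \{0, 2\}$, two tuples are equivalent exactly when they agree on $X$, so the classes correspond to the elements of $\prod_X A_i$.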


It is ultraproducts rather than reduced products in general that are of 
most interest to logicians since ultraproducts preserve logical properties of 
the family $(\mathfrak{A}_i : i \in I)$ (in a sense which is made precise in the main theorem
of the next section). But before proceeding to that, let us mention that in 
the case when the models are division rings, an alternate construction of 
the reduced product may be given. 

Let $(R_i : i \in I)$ be a family of division rings. For any $f \in \prod_I R_i$, let
$Z(f) = \{i \in I \mid f(i) = 0\}$. Notice that for any $f, g \in \prod_I R_i$ we have
$Z(f) \subseteq Z(g)$ if and only if there exists $h_1 \in \prod_I R_i$ such that $h_1 f = g$ if and only if
there exists $h_2 \in \prod_I R_i$ such that $f h_2 = g$. (Here we use the fact that the $R_i$'s
are division rings.) It follows that every (left or right) ideal of $\prod_I R_i$ is
two-sided. If $N$ is a subset of $\prod_I R_i$ let $Z(N) = \{Z(f) : f \in N\}$.


2.2. Theorem (KOCHEN [1961]). Let $(R_i : i \in I)$ be a family of division
rings. If $N$ is a proper ideal of $\prod_I R_i$, then $Z(N)$ is a filter over $I$. The mapping $Z$,
which assigns $Z(N)$ to $N$, is a one-one inclusion-preserving correspondence
between proper ideals of $\prod_I R_i$ and filters over $I$. Under this mapping, principal
ideals correspond to principal filters and maximal ideals correspond to
ultrafilters. Moreover, for any proper ideal $N$ of $\prod_I R_i$, the quotient ring $\prod_I R_i / N$ is
isomorphic to the reduced product $\prod_{Z(N)} R_i$.


Proof. If $S$ is a subset of $I$, let $\hat{S} \in \prod_I R_i$ denote the “characteristic
function” of $S$, i.e. $\hat{S}(x) \in \{0, 1\}$ for all $x \in I$ and $\hat{S}(x) = 0$ if and only if $x \in S$.
Notice that if $N$ is an ideal of $\prod_I R_i$ and $f \in \prod_I R_i$, then $f \in N$ if and only if
$\widehat{Z(f)} \in N$. It follows that for any $f, g \in N$, $Z(f) \cap Z(g) \in Z(N)$, since


$Z(f) \cap Z(g) = Z\bigl(\widehat{Z(\widehat{Z(f)} + \widehat{Z(g)})} + \widehat{Z(f)}\,\widehat{Z(g)}\bigr)$.


(A simpler expression is possible if none of the $R_i$ have characteristic 2.)
Thus $Z(N)$ satisfies (ii) of the definition of a filter; parts (i) and (iii) of the
definition follow easily from the definition of $Z(N)$ and the remarks
preceding the theorem. Hence $Z(N)$ is a filter.

If $D$ is a filter over $I$ let $N_D = \{f \in \prod_I R_i \mid Z(f) \in D\}$; then $N_D$ is an ideal
since $Z(f - g) \supseteq Z(f) \cap Z(g)$ and $Z(fg) \supseteq Z(f)$; $N_D$ is proper since $\emptyset \notin D$.
The mapping which takes $D$ to $N_D$ is inverse to the mapping $Z$ and hence
$Z$ is a one-one correspondence between proper ideals of $\prod_I R_i$ and filters over $I$.
Since $Z$ is clearly inclusion preserving, maximal ideals correspond to
maximal filters, i.e. ultrafilters (see Proposition 1.1). It is also easy to verify
that the ideal $N$ is generated by $f$ if and only if $Z(N)$ is generated by $Z(f)$.
Finally, the function which takes the coset $f + N$ in $\prod_I R_i / N$ to $f_{Z(N)} \in \prod_{Z(N)} R_i$
is an isomorphism of rings. □
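
One direction of the correspondence in Theorem 2.2 can be verified by brute force in a small case. The Python sketch below is an illustration under assumptions that are not in the text: it takes $I = \{0, 1, 2\}$ and the fields $R_i = \mathbb{Z}/2, \mathbb{Z}/3, \mathbb{Z}/5$, and all function names are hypothetical. Since $I$ is finite, every filter over $I$ is principal, so there are seven filters to test.

```python
import operator
from itertools import combinations, product

MODULI = (2, 3, 5)                 # R_i = Z/p_i, a field for each prime p_i
INDEX = frozenset(range(len(MODULI)))
RING = list(product(*(range(p) for p in MODULI)))   # elements of the product


def zero_set(f):
    """Z(f) = {i : f(i) = 0}."""
    return frozenset(i for i, v in enumerate(f) if v == 0)


def principal_filter(y):
    """All supersets (inside INDEX) of the non-empty frozenset y."""
    rest = sorted(INDEX - y)
    return {y | frozenset(c) for r in range(len(rest) + 1)
            for c in combinations(rest, r)}


def ideal_of(filter_sets):
    """N_D = {f : Z(f) in D}."""
    return {f for f in RING if zero_set(f) in filter_sets}


def componentwise(op, f, g):
    """Apply a ring operation coordinate by coordinate."""
    return tuple(op(a, b) % p for a, b, p in zip(f, g, MODULI))


def is_proper_ideal(n):
    """Non-empty, closed under subtraction and under multiplication by
    arbitrary ring elements, and not containing the identity."""
    one = tuple(1 % p for p in MODULI)
    closed_sub = all(componentwise(operator.sub, f, g) in n
                     for f in n for g in n)
    closed_mul = all(componentwise(operator.mul, h, f) in n
                     for h in RING for f in n)
    return bool(n) and closed_sub and closed_mul and one not in n
```

For each of the seven filters $D$ one can then confirm that $N_D$ is a proper ideal and that $Z(N_D) = D$; the seven ideals obtained are pairwise distinct, matching the seven proper ideals of $\mathbb{Z}/30$.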


2.3. Corollary. If $(R_i : i \in I)$ is a family of division rings and $N$ is a prime
ideal of $\prod_I R_i$, then $N$ is a maximal ideal and $\prod_I R_i / N$ is a division ring.


Proof. To prove that $N$ is maximal, it suffices to prove $Z(N)$ is an
ultrafilter. Let $X \subseteq I$. If $X \notin Z(N)$ and $I - X \notin Z(N)$, then $\hat{X}$ and $\widehat{I - X}$ are
not in $N$. But $\hat{X}\,\widehat{I - X} = 0 \in N$, a contradiction since $N$ is prime. Therefore $Z(N)$ is an
ultrafilter. To see that $\prod_I R_i / N$ is a division ring, let $f \in \prod_I R_i$ be such that
$f \notin N$. Then $Z(f) \notin Z(N)$ so $I - Z(f) \in Z(N)$. If $g \in \prod_I R_i$ is such that
$g(i) = f(i)^{-1}$ for all $i \in I - Z(f)$, then $\{i \in I \mid g(i) f(i) = 1\} = I - Z(f) \in Z(N)$.
Therefore by the remarks preceding Theorem 2.2, $1 - gf \in N$, i.e. $f$ is a unit
in $\prod_I R_i / N$. □


3. The fundamental theorem 


As a result of Corollary 2.3 we know that an ultraproduct of division
rings is a division ring (while a reduced product of division rings which is
not an ultraproduct will not be a division ring). This is a special case of the
fundamental theorem of ultraproducts which says that elementary properties
of the family $(\mathfrak{A}_i : i \in I)$ are preserved under ultraproducts. More
precisely, the fundamental theorem, or Łoś' Theorem, is the following.


3.1. Theorem (ŁOŚ [1955]). Let $(\mathfrak{A}_i : i \in I)$ be a family of models for $L$
and let $D$ be an ultrafilter over $I$. Then for any formula $\varphi(x_1, \ldots, x_n)$ of $L$ and
any elements $g^1, \ldots, g^n$ of $\prod_I A_i$,


$\prod_D \mathfrak{A}_i \models \varphi[g^1_D \cdots g^n_D]$
if and only if
$\{i \in I \mid \mathfrak{A}_i \models \varphi[g^1(i) \cdots g^n(i)]\} \in D$.


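Before proving the fundamental theorem, let us consider some consequences
of it. For the case when $\varphi$ has no free variables we obtain the
following.


3.2. Corollary. For any ultraproduct $\prod_D \mathfrak{A}_i$ and any sentence $\varphi$, $\prod_D \mathfrak{A}_i \models \varphi$
if and only if $\{i \in I \mid \mathfrak{A}_i \models \varphi\} \in D$.


3.3. Corollary. If $\mathfrak{A}_i$ is in $\mathcal{M}(\Sigma)$ for each $i \in I$, then the ultraproduct
$\prod_D \mathfrak{A}_i$ is in $\mathcal{M}(\Sigma)$.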


A subclass of $\mathcal{M}(L)$ which is of the form $\mathcal{M}(\Sigma)$ for some set of sentences
$\Sigma$ of $L$ is called (first-order) axiomatizable, or elementary. Corollary 3.3
says that any axiomatizable class is closed under ultraproducts. For 
example, an ultraproduct of groups (semigroups; rings; commutative rings; 
fields; algebraically closed fields; division rings; formally real fields; real 
closed fields; Lie algebras; Boolean algebras; etc.) is a group (semigroup; 
ring; commutative ring; field; algebraically closed field; division ring; 
formally real field; real closed field; Lie algebra; Boolean algebra; etc.) 
because the class in question is axiomatizable. Proving that a class is not 
closed under ultraproducts is a useful method of proving that a class is not 
axiomatizable. We give an example. A different proof is given in 2.3 of 
Chapter A.1. It is instructive to compare the two proofs. 


3.4. Corollary. Let $L = \{+, 0\}$ be the language of abelian groups (i.e. $L$
has a binary function symbol, $+$, and a constant symbol, $0$). The class of
torsion abelian groups is not first-order axiomatizable.

Proof. It suffices to produce an ultraproduct of torsion abelian groups
which is not torsion. Let $I = \omega$ and let $A_n$ be the cyclic group of order $n + 1$
for each $n \in I$. We claim that if $D$ is a non-principal ultrafilter over $I$, then
$\prod_D A_n$ is not torsion. Let $f \in \prod_I A_n$ be such that $f(n)$ has order $n + 1$ for
each $n \in I$. For any $m > 0$, if $\varphi_m(x)$ is the formula


$(x + x + \cdots + x = 0)$
(where $x$ occurs $m$ times), then by 3.1, $\prod_D A_n \not\models \varphi_m[f_D]$ since
$\{n \in I \mid A_n \models \varphi_m[f(n)]\}$


is a finite set and hence is not in $D$ by 1.4. Therefore $f_D$ is not of finite
order. (Instead of appealing to 3.1 we could verify directly from the
definitions that $m f_D \neq 0$ in $\prod_D A_n$ for any $m \neq 0$.) Hence $\prod_D A_n$ is not
torsion. □
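
The finiteness claim in this proof is easy to see numerically. In the Python sketch below (an illustration with a hypothetical function name), $A_n$ is modelled as $\mathbb{Z}/(n+1)$ and $f(n)$ is taken to be the generator $1$, which is one admissible choice of an element of order $n + 1$. Then $m \cdot f(n) = 0$ holds exactly when $n + 1$ divides $m$, so only finitely many indices satisfy $\varphi_m$.

```python
def annihilated_indices(m, bound):
    """Indices n < bound with m * f(n) = 0 in Z/(n+1), where f(n) = 1.

    This is the set {n : A_n satisfies phi_m[f(n)]} restricted to n < bound.
    """
    return [n for n in range(bound) if (m * 1) % (n + 1) == 0]
```

For $m = 12$ the indices found are those $n$ with $n + 1 \in \{1, 2, 3, 4, 6, 12\}$, however large the search bound is taken; a non-principal ultrafilter contains no finite set, so this set is not in $D$.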


This method of proving that a class is not axiomatizable will not always 
work since — as we shall see in Section 4— there are classes closed under 
ultraproducts which are not axiomatizable. See Section 5 for an ultra- 
product method of proving that a class is not finitely axiomatizable. Now 
let us prove 3.1. 


Proof of Theorem 3.1. For the sake of concreteness we shall prove the
theorem for our standard example, $\mathfrak{A}_i = (A_i, +_i, \le_i, 0_i)$, but the argument
is perfectly general. The proof is by induction on the formula $\varphi$. Consider
first the simplest kinds of atomic formulas: $x_1 \le x_2$; $x_1 + x_2 = x_3$; or $x_1 = 0$. If
$\varphi$ is one of these, then the result follows from the definition of the
ultraproduct.

The most general types of atomic formulas are $t_1 = t_2$ and $t_1 \le t_2$ where $t_1$
and $t_2$ are terms built up from the function symbol $+$ and the constant
symbol $0$ [e.g. $t_1 = ((x_1 + x_2) + ((0 + x_3) + x_1)) + ((x_4 + x_2) + 0)$]. In this case,
the desired result follows from the first case above once we prove, by
induction on the construction of a term $t(x_1 \cdots x_n)$, that $t_{\prod_D \mathfrak{A}_i}[g^1_D \cdots g^n_D] = h_D$
if and only if $\{i \in I \mid t_{\mathfrak{A}_i}[g^1(i) \cdots g^n(i)] = h(i)\} \in D$. We leave the details
to the reader: the initial cases ($t = x_1 + x_2$; $t = x_1$; or $t = 0$) follow from the
definition of the ultraproduct. (See 3.7 in Chapter A.1 for the definition of
$t_{\mathfrak{A}}$, there written $t^{\mathfrak{A}}$.)

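Now suppose $\varphi = \neg\psi(x_1 \cdots x_n)$ and suppose the theorem proved for $\psi$.
Then
$\prod_D \mathfrak{A}_i \models \varphi[g^1_D \cdots g^n_D] \iff \prod_D \mathfrak{A}_i \not\models \psi[g^1_D \cdots g^n_D]$
$\iff \{i \in I \mid \mathfrak{A}_i \models \psi[g^1(i) \cdots g^n(i)]\} \notin D$
$\iff \{i \in I \mid \mathfrak{A}_i \not\models \psi[g^1(i) \cdots g^n(i)]\} \in D$
$\iff \{i \in I \mid \mathfrak{A}_i \models \varphi[g^1(i) \cdots g^n(i)]\} \in D$.


(The next to last equivalence uses the fact that $D$ is an ultrafilter, not
simply a filter.) If $\varphi = \psi_1 \wedge \psi_2$, and the theorem is assumed for $\psi_1$ and $\psi_2$, a
similar argument proves the result for $\varphi$; here, the key fact is that for any
filter $D$, $X \cap Y \in D$ if and only if $X \in D$ and $Y \in D$.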

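Finally, if $\varphi = \exists x_0\, \psi(x_0 \cdots x_n)$, then $\prod_D \mathfrak{A}_i \models \varphi[g^1_D \cdots g^n_D] \iff$ there exists
$g^0 \in \prod_I A_i$ such that $\prod_D \mathfrak{A}_i \models \psi[g^0_D, g^1_D, \ldots, g^n_D] \iff$ there exists $g^0 \in \prod_I A_i$
such that


$\{i \in I \mid \mathfrak{A}_i \models \psi[g^0(i), g^1(i), \ldots, g^n(i)]\} \in D$
$\Rightarrow \{i \in I \mid \mathfrak{A}_i \models \varphi[g^1(i) \cdots g^n(i)]\} \in D$.
Conversely, if $\{i \in I \mid \mathfrak{A}_i \models \varphi[g^1(i) \cdots g^n(i)]\} = X \in D$, then there exists
$g^0 \in \prod_I A_i$ such that for every $i \in X$, $\mathfrak{A}_i \models \psi[g^0(i), g^1(i), \ldots, g^n(i)]$; hence
$\{i \in I \mid \mathfrak{A}_i \models \psi[g^0(i), g^1(i), \ldots, g^n(i)]\} \in D$. □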


Because of the importance of the fundamental theorem, from now on we 
shall confine our attention to ultraproducts (and ultrapowers) rather than 
arbitrary reduced products. 


4. Ultraproducts as functors 


Throughout the paper we shall assume that all the classes of models we
consider are closed under isomorphism. If $\mathcal{A}$ is a subclass of $\mathcal{M}(L)$, we shall
also denote by $\mathcal{A}$ the category whose objects are the elements of $\mathcal{A}$ and
whose morphisms are all the homomorphisms between elements of $\mathcal{A}$. We
shall also consider the category $\mathcal{A}^I$, whose objects are the indexed families
$(\mathfrak{A}_i : i \in I)$ of objects of $\mathcal{A}$ and whose morphisms from $(\mathfrak{A}_i : i \in I)$ to
$(\mathfrak{B}_i : i \in I)$ are the indexed families $(\eta_i : i \in I)$ of homomorphisms
$\eta_i : \mathfrak{A}_i \to \mathfrak{B}_i$. Since these are the only categories we consider, we shall freely
interchange the words “class” and “category” in referring to $\mathcal{A}$ or $\mathcal{A}^I$.

If $(\eta_i : i \in I)$ is a family of homomorphisms $\eta_i : \mathfrak{A}_i \to \mathfrak{B}_i$, let
$\prod_D \eta_i : \prod_D \mathfrak{A}_i \to \prod_D \mathfrak{B}_i$ be the function defined by $(\prod_D \eta_i)(f_D) = g_D$, where
$g(i) = \eta_i(f(i))$. It is easy to check that $\prod_D \eta_i$ is a well-defined
homomorphism, called the ultraproduct of the $\eta_i$ modulo $D$.

If $\eta = \eta_i : \mathfrak{A} \to \mathfrak{B}$ for all $i \in I$, then $\prod_D \eta_i$ is denoted by $\prod_D \eta$ and called
the ultrapower of $\eta$ modulo $D$; it is a homomorphism from $\prod_D \mathfrak{A}$ to $\prod_D \mathfrak{B}$. It
is now a routine exercise in the definitions to verify the following result.
The notion of elementary embedding is defined in Chapter A.2.


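4.1. Theorem. If $D$ is an ultrafilter over $I$ and $\mathcal{A}$ is a subclass of $\mathcal{M}(L)$
closed under ultrapowers (resp. ultraproducts), then the ultrapower (resp.
ultraproduct) $\prod_D$ is a functor from $\mathcal{A}$ to $\mathcal{A}$ (resp. from $\mathcal{A}^I$ to $\mathcal{A}$).


4.2. Theorem. The ultrapower functor on $\mathcal{A}$ preserves embeddings and
elementary embeddings. Similarly the ultraproduct functor takes a family
$(\eta_i : i \in I)$ of embeddings (resp. elementary embeddings) to an embedding
(resp. elementary embedding).


Proof. It suffices to prove the result for ultraproducts. Now by definition
$\eta_i : \mathfrak{A}_i \to \mathfrak{B}_i$ is an embedding iff $\eta_i : \mathfrak{A}_i \to \operatorname{ran} \eta_i$ is an isomorphism. Thus if
$(\eta_i : i \in I)$ is a family of embeddings, $\prod_D \eta_i$ is an embedding because
functors preserve isomorphisms and the range of $\prod_D \eta_i$ is $\prod_D (\operatorname{ran} \eta_i)$. For the
case of elementary embeddings we appeal to the fundamental theorem
(3.1). Suppose $\varphi(x_1 \cdots x_n)$ is a formula of $L$ and $g^1_D, \ldots, g^n_D \in \prod_D \mathfrak{A}_i$. Then


$\prod_D \mathfrak{A}_i \models \varphi[g^1_D \cdots g^n_D] \iff \{i \in I \mid \mathfrak{A}_i \models \varphi[g^1(i) \cdots g^n(i)]\} \in D$
$\iff \{i \in I \mid \mathfrak{B}_i \models \varphi[\eta_i(g^1(i)) \cdots \eta_i(g^n(i))]\} \in D$
$\iff \prod_D \mathfrak{B}_i \models \varphi[(\prod_D \eta_i)(g^1_D) \cdots (\prod_D \eta_i)(g^n_D)]$. □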


One useful aspect of the ultraproduct construction is that, for a fixed
ultrafilter, it provides a uniform way of defining functors on different
categories. In order to give a formal expression of this uniformity we need
a notion of “forgetful functor”. Let $L$, $L'$ be first-order languages such that
$L \subseteq L'$. Recall that a model for $L'$ is a pair $\mathfrak{A}' = (A', \mathcal{I}')$ where $\mathcal{I}'$ assigns an
interpretation to each symbol of $L'$. If $\mathfrak{A}'$ is a model for $L'$, let $R_{L'L}(\mathfrak{A}')$ be
the model $\mathfrak{A} = (A', \mathcal{I})$ for $L$ where $\mathcal{I}$ is the restriction of $\mathcal{I}'$ to the symbols
of $L$. $R_{L'L}(\mathfrak{A}')$ is called the reduct of $\mathfrak{A}'$ to $L$. If $\eta : \mathfrak{A}' \to \mathfrak{B}'$ is a
homomorphism of models for $L'$, $R_{L'L}(\eta)$ is the same set function $\eta$
regarded as a homomorphism of $R_{L'L}(\mathfrak{A}')$ to $R_{L'L}(\mathfrak{B}')$. It is easily seen that
$R_{L'L}$ is a functor from $\mathcal{M}(L')$ to $\mathcal{M}(L)$ called the forgetful functor. (We shall
drop the subscripts on $R$ where, in context, there is no ambiguity.) For
example, if $L$ is the empty set of symbols and $L \subseteq L'$, then $\mathcal{M}(L)$ is the
category of sets and $R_{L'L}$ is the familiar underlying set functor from $\mathcal{M}(L')$
to $\mathcal{M}(L)$.
If $R_{L'L}$ takes the subclass $\mathcal{A}'$ of $\mathcal{M}(L')$ to the subclass $\mathcal{A}$ of $\mathcal{M}(L)$ we shall
also denote by $R_{L'L}$ the restriction of $R_{L'L}$ to $\mathcal{A}'$ and call it the forgetful
functor from $\mathcal{A}'$ to $\mathcal{A}$. For example there is a forgetful functor $R_{L'L}$ from the
class of ordered abelian groups to the class of abelian groups. (Here
$L = \{+, 0\}$ and $L' = \{+, \le, 0\}$.)

The next result is now an easy consequence of the definitions. 


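4.3. Theorem. Let $D$ be an ultrafilter over $I$ and let $L$ and $L'$ be first-order
languages such that $L \subseteq L'$. Then the following diagrams (for the ultrapower
and ultraproduct respectively) are commutative.


$$
\begin{array}{ccc}
\mathcal{M}(L') & \xrightarrow{\;R_{L'L}\;} & \mathcal{M}(L) \\
{\scriptstyle \prod_D}\big\downarrow & & \big\downarrow{\scriptstyle \prod_D} \\
\mathcal{M}(L') & \xrightarrow{\;R_{L'L}\;} & \mathcal{M}(L)
\end{array}
\qquad\qquad
\begin{array}{ccc}
\mathcal{M}(L')^I & \xrightarrow{\;R_{L'L}\;} & \mathcal{M}(L)^I \\
{\scriptstyle \prod_D}\big\downarrow & & \big\downarrow{\scriptstyle \prod_D} \\
\mathcal{M}(L') & \xrightarrow{\;R_{L'L}\;} & \mathcal{M}(L)
\end{array}
$$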


Theorem 4.3 is useful in proving that certain classes are closed under
ultraproducts, even when we do not know if the class is axiomatizable. Let
us say that a class $\mathcal{A} \subseteq \mathcal{M}(L)$ is pseudo-elementary if there is a forgetful
functor $R = R_{L'L}$ such that $\mathcal{A}$ is the image under $R$ of an elementary (i.e.
axiomatizable) class $\mathcal{A}' \subseteq \mathcal{M}(L')$.


4.4. Corollary. A pseudo-elementary class is closed under ultraproducts.


Proof. Maintaining the notation of the definition, let us prove that for any
family $(\mathfrak{A}_i : i \in I)$ in $\mathcal{A}$, $\prod_D \mathfrak{A}_i$ is in $\mathcal{A}$. By hypothesis, each $\mathfrak{A}_i$ is of the form
$R(\mathfrak{A}'_i)$ for some $\mathfrak{A}'_i$ in $\mathcal{A}'$. But then by Corollary 3.3 and Theorem 4.3 we
have


$\prod_D \mathfrak{A}_i = \prod_D R(\mathfrak{A}'_i) = R(\prod_D \mathfrak{A}'_i) \in R(\mathcal{A}') \subseteq \mathcal{A}$. □


4.5. Theorem. Let $n \in \omega$. If $\mathcal{A}$ is the class of all groups which are
isomorphic to $GL(n, F)$ for some field $F$, then $\mathcal{A}$ is closed under
ultraproducts.


Proof. Here we use $GL(n, F)$ to denote the group, under multiplication,
of the invertible $n \times n$ matrices over $F$. By 4.4 it suffices to prove that $\mathcal{A}$ is
pseudo-elementary. Expand the language $L = \{\cdot, e\}$ of groups to
$L' = \{\cdot, e, +, *, 0, 1, \pi_{ij}\}$ where $+$, $*$ are binary function symbols, $0$ and $1$ are
constant symbols, and $\pi_{ij}$, $1 \le i, j \le n$, are unary function symbols. Let $\mathcal{A}'$




be the class of all models $\mathfrak{A} = (A, \cdot, e, +, *, 0, 1, \pi_{ij})$ for $L'$ satisfying the
following properties (which are easily seen to be expressible as sentences in
$L'$):

(i) two elements $a_1$, $a_2$ of $A$ are equal if and only if $\pi_{ij}(a_1) = \pi_{ij}(a_2)$ for
all $i, j$ (hence elements of $A$ “are” $n \times n$ matrices);

(ii) the union $F$ of the ranges of the $\pi_{ij}$ forms a field with respect to
$+$, $*$, $0$, and $1$;

(iii) the “matrix” $e$ is the identity matrix;

(iv) the operation $\cdot$ is matrix multiplication; and

(v) the elements of $A$ are precisely the $n \times n$ matrices over $F$ which are
invertible.
Then $\mathcal{A}$ is the image of $\mathcal{A}'$ under the forgetful functor. □


The class $\mathcal{A}$ in the above theorem is known not to be axiomatizable. (See
SABBAGH [1969]; the proof is given for n = 1, but it generalizes to arbitrary 
n.) It is an open problem whether the class in the next theorem is 
elementary or not. 


4.6. Theorem (Sabbagh). The class of primitive rings is closed under
ultraproducts.


Proof. For our purposes, the most convenient definition of a primitive
ring is that $R$ is primitive if and only if there is a maximal regular right ideal
$\rho$ in $R$ such that $(\rho : R) = (0)$ (see HERSTEIN [1968], p. 40). With this
definition it is not hard to see that the class of primitive rings is
pseudo-elementary. □


We conclude this section with one other useful result about ultrapowers. 


4.7. Theorem. Let $D$ be an ultrafilter over a set $I$. There is a natural
transformation $d : I \to \prod_D$ from the identity functor on $\mathcal{M}(L)$ to the
ultrapower functor on $\mathcal{M}(L)$ such that for each $\mathfrak{A}$ in $\mathcal{M}(L)$, $d(\mathfrak{A})$ is an
elementary embedding. Moreover, for any expansion $L'$ of $L$ we have
$R_{L'L}\, d = d\, R_{L'L}$.


Proof. For any $\mathfrak{A}$ in $\mathcal{M}(L)$ let $d(\mathfrak{A}) : \mathfrak{A} \to \prod_D \mathfrak{A}$ be defined by $d(a) = \bar{a}_D$
where $\bar{a} : I \to A$ is the constant function with value $a$. That $d(\mathfrak{A})$ is an
elementary embedding follows from the fundamental theorem, 3.1, since


$\prod_D \mathfrak{A} \models \varphi[d(a_1) \cdots d(a_n)] \iff \{i \in I \mid \mathfrak{A}_i = \mathfrak{A} \models \varphi[a_1 \cdots a_n]\} \in D$


$\iff \mathfrak{A} \models \varphi[a_1 \cdots a_n]$.
We leave to the reader the easy verification that $d$ is a natural transformation
(i.e. $d(\mathfrak{B}) \circ f = \prod_D (f) \circ d(\mathfrak{A})$ for any $f : \mathfrak{A} \to \mathfrak{B}$) and that
$R_{L'L}\, d = d\, R_{L'L}$. □


COMPACTNESS 
5. An ultraproduct version of the Compactness Theorem 


The fundamental theorem implies that for any family $(\mathfrak{A}_i : i \in I)$ and any
ultrafilter $D$, if each $\mathfrak{A}_i$ satisfies a set of first-order sentences $\Sigma$, then $\prod_D \mathfrak{A}_i$
also satisfies $\Sigma$. The content of the next theorem (which is an easy corollary
of the fundamental theorem) is that for certain ultrafilters $D$, the
ultraproduct $\prod_D \mathfrak{A}_i$ may satisfy a set of first-order properties $\Sigma$ though no $\mathfrak{A}_i$ is a
model of $\Sigma$.


5.1. Theorem. Let $(\mathfrak{A}_i : i \in I)$ be an indexed family of models for $L$ and let
$\Sigma$ be a set of sentences of $L$.

(i) If for every sentence $\varphi$ of $\Sigma$, $\{i \in I \mid \mathfrak{A}_i \models \varphi\}$ is cofinite, then for any
non-principal ultrafilter $D$ over $I$, $\prod_D \mathfrak{A}_i$ is a model of $\Sigma$.

(ii) If for every finite subset $\{\varphi_1, \ldots, \varphi_n\}$ of $\Sigma$, $\{i \in I \mid \mathfrak{A}_i \models \varphi_1 \wedge \cdots \wedge \varphi_n\}$ is
non-empty, then there exists an ultrafilter $D$ over $I$ such that $\prod_D \mathfrak{A}_i$ is a model
of $\Sigma$.


Proof. (i) By Proposition 1.4, if $D$ is non-principal, then $\{i \in I \mid \mathfrak{A}_i \models \varphi\}$ is
in $D$ for every $\varphi \in \Sigma$, and therefore, by Corollary 3.2, $\prod_D \mathfrak{A}_i \models \varphi$.

(ii) For each finite subset $F$ of $\Sigma$, let $I_F = \{i \in I \mid \mathfrak{A}_i \models F\}$; $I_F$ is non-empty
by hypothesis. Notice that $S = \{I_F \mid F \text{ is a finite subset of } \Sigma\}$ has the FIP since
$I_{F_1} \cap \cdots \cap I_{F_n} \supseteq I_G$ where $G = \bigcup_{j=1}^{n} F_j$. Thus by Corollary 1.2, $S$ is
contained in an ultrafilter $D$ over $I$. Since $I_{\{\varphi\}} \in D$, it follows from Corollary
3.2 that $\prod_D \mathfrak{A}_i \models \varphi$ for every $\varphi \in \Sigma$. □


As a corollary of the theorem we may obtain an algebraic proof of the 
Compactness Theorem discussed in the previous chapters. 


5.2. Corollary. If $\Sigma$ is a set of sentences of $L$ such that every finite subset
of $\Sigma$ has a model, then $\Sigma$ has a model.


Proof. Let $I$ be the set of all finite subsets of $\Sigma$ and for each $i \in I$, let $\mathfrak{A}_i$
be a model of $i$. It is clear that the hypothesis of 5.1(ii) is satisfied, and
therefore there is an ultrafilter $D$ over $I$ such that $\prod_D \mathfrak{A}_i$ is a model of
$\Sigma$. □


Most, if not all, of the algebraic applications of ultraproducts which are 
given in this and the next two sections can be proved without use of 
ultraproducts by using instead Corollary 5.2, the Compactness Theorem. 
Many of them were first proved in this way. However, it is the goal of this 
paper to show how ultraproducts may be used to give more “algebraic”
proofs. Thus our proofs are more-or-less direct applications of Theorem 
5.1 or Theorem 3.1 and even the appeal to these general theorems can be 
replaced in specific applications by a direct verification from the definition 
of ultraproduct of the desired properties (cf. the remark at the end of the 
proof of 3.4). 


5.3. Theorem. Let $P$ be an infinite set of primes and for each $p \in P$ let $F_p$ be a
field of characteristic $p$. If $D$ is a non-principal ultrafilter over $P$, then $\prod_D F_p$ is
a field of characteristic zero.


Proof. Let $\Sigma = \{\varphi_n \mid n \ge 1\}$ where $\varphi_n$ is the sentence “$n \cdot 1 \neq 0$”. For each
$n$, $\{p \in P \mid F_p \models \varphi_n\}$ is cofinite since it is the set of primes in $P$ not dividing $n$.
Hence by 5.1(i), $\prod_D F_p$ is a model of $\Sigma$, i.e. $\prod_D F_p$ has characteristic zero. □
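
The cofiniteness used in this proof can be made concrete. The Python sketch below is an illustration only: it takes $F_p = \mathbb{Z}/p$ as one particular field of characteristic $p$, and both function names are hypothetical. It lists the primes $p$ at which $\varphi_n$ fails, which are exactly the prime divisors of $n$.

```python
def primes_below(bound):
    """Primes p < bound by trial division (adequate for small bounds)."""
    return [p for p in range(2, bound)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def exceptional_primes(n, primes):
    """Primes p in `primes` for which n * 1 = 0 in Z/p, i.e. phi_n fails."""
    return [p for p in primes if n % p == 0]
```

For $n = 60$ the only exceptional primes are 2, 3 and 5, so $\varphi_{60}$ holds in $F_p$ for every other prime; a non-principal ultrafilter over $P$ therefore contains the set of primes where $\varphi_{60}$ holds.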


A class is finitely axiomatizable (in $L$) if it is of the form $\mathcal{M}(\Sigma)$ where $\Sigma$ is
a finite set of sentences of $L$; notice that in this case we may assume, by
forming the conjunction of the sentences in $\Sigma$, that $\Sigma$ consists of a single
sentence. If $\mathcal{A} = \mathcal{M}(\{\theta\})$ is a finitely axiomatizable subclass of $\mathcal{M}(L)$ then
the complement of $\mathcal{A}$ in $\mathcal{M}(L)$ is finitely axiomatizable (it is $\mathcal{M}(\{\neg\theta\})$)
and hence, by 3.3, closed under ultraproducts. Since by 5.3, the class of
fields of non-zero characteristic is not closed under ultraproducts we obtain
the following.


5.4. Corollary. The class of fields of characteristic zero is axiomatizable
but not finitely axiomatizable.


Similarly we can prove the following. (For a proof and other examples of 
non-finite axiomatizability, see Chapter 5, §3, of BELL and SLOMSON
[1969].) 


5.5. Theorem (Tarski). The class of algebraically closed fields is
axiomatizable but not finitely axiomatizable.

5.6. Theorem. If $\varphi$ is a sentence of the language of fields which is true in
every field (resp. every algebraically closed field) of characteristic zero, then
there is a finite set of primes $P_0$ such that $\varphi$ is true in every field (resp. every
algebraically closed field) of characteristic $p$ if $p \notin P_0$.


Proof. Suppose not. Then there is an infinite set of primes $P$ such that for
each $p \in P$ there is a field (resp. algebraically closed field) $F_p$ of characteristic $p$ which is a
model of $\neg\varphi$. By 5.3 and 3.2, if $D$ is a non-principal ultrafilter over $P$, $\prod_D F_p$
is a field (resp. algebraically closed field) of characteristic zero which is a
model of $\neg\varphi$, a contradiction. □


We close this section with a definition which plays an important role in a
very deep result of AX [1968] that the theory of finite fields is decidable.
Let $\Sigma$ be the set of all sentences of the language of fields which are true in
every finite field. An infinite model of $\Sigma$ is called a pseudofinite field. (AX
[1968] gives a purely algebraic description of these fields.) The following
immediate corollary of 5.1(i) shows how pseudofinite fields may be
constructed as ultraproducts.


5.7. COROLLARY. Let $(F_n : n \in \omega)$ be a family of finite fields such that for
each $m$, $\{n \in \omega \mid \text{cardinality of } F_n \le m\}$ is finite. Let $D$ be a non-principal
ultrafilter over $\omega$. Then $\prod_D F_n$ is pseudofinite. □


6. Embedding theorems 


The following theorem is an abstraction of a method frequently employed
in proofs involving ultraproducts (cf. KEGEL and WEHRFRITZ [1973],
p. 66; also GRÄTZER [1968], Theorem 7, p. 261).

A set $\{\mathfrak{B}_i \mid i \in I\}$ of substructures of $\mathfrak{B}$ is called a local system for $\mathfrak{B}$ if: (1)
$B = \bigcup\{B_i \mid i \in I\}$; and (2) for every $i, j \in I$ there exists $k \in I$ such that
$B_i \cup B_j \subseteq B_k$. For example, the set of all finitely-generated substructures of
$\mathfrak{B}$ is a local system for $\mathfrak{B}$.


6.1. THEOREM. Let $\{\mathfrak{B}_i \mid i \in I\}$ be a local system for a model $\mathfrak{B} \in M(L)$.
Suppose $(\mathfrak{A}_i : i \in I)$ is an indexed family of members of $M(L)$ such that for
each $i \in I$, there is an embedding (resp. homomorphism) of $\mathfrak{B}_i$ into $\mathfrak{A}_i$. Then
there is an ultrafilter $D$ on $I$ and an embedding (resp. homomorphism) of $\mathfrak{B}$
into $\prod_D \mathfrak{A}_i$.


PROOF. Choose an embedding (resp. homomorphism) $\xi_i : \mathfrak{B}_i \to \mathfrak{A}_i$ for each
$i \in I$. For each $j \in I$, let $S_j = \{i \in I \mid B_j \subseteq B_i\}$. Then $S = \{S_j \mid j \in I\}$ has the
FIP since $S_{j_1} \cap \cdots \cap S_{j_n} \supseteq S_k$ where $k$ is such that $B_{j_1} \cup \cdots \cup B_{j_n} \subseteq B_k$.
Hence by Corollary 1.2, $S$ is contained in an ultrafilter $D$ over $I$. For each
$j \in I$, there is an embedding (resp. homomorphism) $\eta_j : \mathfrak{B}_j \to \prod_D \mathfrak{A}_i$ given
by: $\eta_j(b) = \bar b_D$ where $\bar b(i) = \xi_i(b)$ if $i \in S_j$ and $\bar b(i)$ is arbitrary otherwise.
Now by definition of a local system, $\mathfrak{B}$ is the direct limit of the direct system
consisting of the $\mathfrak{B}_i$, $i \in I$, and inclusion maps between them. The functions
$\{\eta_j \mid j \in I\}$ form a compatible family of maps with respect to this direct
system and hence they induce a function $\eta : \mathfrak{B} \to \prod_D \mathfrak{A}_i$ which is easily seen
to be an embedding (resp. homomorphism). □


6.2. COROLLARY. If $\mathcal{K} \subseteq M(L)$ is closed under ultraproducts and if every
finitely-generated substructure of $\mathfrak{B} \in M(L)$ is embeddable in a member of
$\mathcal{K}$, then $\mathfrak{B}$ is embeddable in a member of $\mathcal{K}$. □


A group $G$ is called a linear group of degree $n$ if it is isomorphic to a
subgroup of $GL(n, F)$ for some field $F$. The following is an immediate
consequence of Corollary 6.2 and Theorem 4.5.


6.3. THEOREM (MAL'CEV [1940]). Let $G$ be a group such that every finitely
generated subgroup of $G$ is a linear group of degree $n$. Then $G$ is a linear
group of degree $n$. □


For the sake of simplicity of exposition and proof we shall assume for the
remainder of this section that our language $L$ has a finite vocabulary, i.e.
only a finite number of relation, function and constant symbols. If $\mathfrak{A}$ is a
model for $L$, an equation over $\mathfrak{A}$ is an expression of the form $y_0 = f(y_1 \cdots y_n)$
or $R(y_1 \cdots y_n)$ where $f$ (resp. $R$) is an $n$-ary function (resp. relation)
symbol and each $y_i$ is a variable or constant symbol of $L$ or an element of $A$
(or, more precisely, a new symbol $c_a$ representing an element $a$ of $A$). An
inequation over $\mathfrak{A}$ is the negation of an equation over $\mathfrak{A}$. If $\mathfrak{A} \subseteq \mathfrak{B}$ we say $\mathfrak{A}$
is algebraically closed (resp. existentially closed) in $\mathfrak{B}$ if any finite system of
equations (resp. equations and inequations) over $\mathfrak{A}$ which has a solution in
$\mathfrak{B}$ has a solution in $\mathfrak{A}$. We say $\mathfrak{A} \in \mathcal{K}$ is algebraically closed (resp.
existentially closed) in $\mathcal{K}$ if $\mathfrak{A}$ is algebraically closed (resp. existentially
closed) in every extension which is a member of $\mathcal{K}$. (A synonym for
existentially closed is existentially complete. For more on these notions see
Chapter A.4.)


6.4. THEOREM. Suppose $\mathfrak{A} \subseteq \mathfrak{B}$. Then $\mathfrak{A}$ is algebraically closed (resp.
existentially closed) in $\mathfrak{B}$ if and only if there is an ultrafilter $D$ and a
commutative diagram

$$
\begin{array}{ccc}
 & \prod_D \mathfrak{A} & \\
 {\scriptstyle d}\nearrow & & \nwarrow{\scriptstyle \eta} \\
 \mathfrak{A} & \subseteq & \mathfrak{B}
\end{array}
$$

such that $\eta$ is a homomorphism (resp. embedding).


PROOF. For sufficiency, notice that $\eta$ carries a solution of a finite system $S$
of equations (resp. equations and inequations) with parameters from $A$
into a solution of $d(S)$ (i.e. $S$ with each parameter $a$ replaced by $d(a)$) in
$\prod_D \mathfrak{A}$. If $b^{(1)}_D, \dots, b^{(n)}_D$ is a solution of $d(S)$ in $\prod_D \mathfrak{A}$, then by Theorem 3.1
there is a (non-empty) set $X$ in $D$ such that for each $i \in X$, $b^{(1)}(i),
\dots, b^{(n)}(i)$ is a solution of $S$ in $\mathfrak{A}$. (Alternately, one may use the fact (4.7)
that $d$ is an elementary embedding: a fortiori $d(\mathfrak{A})$ is existentially closed in
$\prod_D \mathfrak{A}$.)

For necessity, we first note that without loss of generality we may assume
that $L$ has no function symbols (by replacing functions by their graphs). Let
$I$ be the set of all pairs $(F_1, F_2)$ where $F_1$ is a finite subset of $A$ including
the interpretations of all constant symbols of $L$, and $F_2$ is a finite subset of
$B - A$. Given $i = (F_1, F_2) \in I$, where, say, $F_2 = \{b_1 \cdots b_n\}$, let $S$ be the set of
all equations (resp. equations and inequations) in $x_1 \cdots x_n$ with constants
from $F_1$ which are satisfied in $\mathfrak{B}$ when we let $x_j = b_j$ for $j = 1, \dots, n$. Since $S$
is finite, $S$ has a solution $x_1 = a_1, \dots, x_n = a_n$ in $\mathfrak{A}$. If $\mathfrak{B}_i$ denotes the
substructure of $\mathfrak{B}$ with universe $F_1 \cup F_2$ (here we use the fact that $L$ has no
function symbols) then there is a homomorphism (resp. embedding)
$\xi_i : \mathfrak{B}_i \to \mathfrak{A}$ such that $\xi_i$ is the identity on $F_1$ and $\xi_i(b_j) = a_j$ for $j = 1, \dots, n$.
Applying Theorem 6.1 to the local system $\{\mathfrak{B}_i \mid i \in I\}$, we obtain the
desired homomorphism (resp. embedding) $\eta : \mathfrak{B} \to \prod_D \mathfrak{A}$. □


In similar fashion one can give necessary and sufficient conditions for $\mathfrak{A}$
to be an elementary substructure of $\mathfrak{B}$ (see BELL and SLOMSON [1969],
Chapter 8, §1). The following nice application of 6.4 is due to SABBAGH
[unpublished] and BACSICH [1972]. (Special cases were previously known;
for example the case of groups is due to NEUMANN [1952].) A model $\mathfrak{A} \in \mathcal{K}$
is called simple in $\mathcal{K}$ if every non-constant morphism in $\mathcal{K}$ with domain $\mathfrak{A}$ is
one-one.


6.5. THEOREM. Suppose $\mathcal{K}$ is closed under ultraproducts and suppose every
member of $\mathcal{K}$ can be embedded in a simple member of $\mathcal{K}$. If $\mathfrak{A}$ is
algebraically closed in $\mathcal{K}$ and non-trivial (i.e. has cardinality $\ge 2$), then $\mathfrak{A}$
is existentially closed in $\mathcal{K}$ and $\mathfrak{A}$ is simple in $\mathcal{K}$.


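PROOF. To prove the first part we must show that if $\mathfrak{A}$ is a submodel of
$\mathfrak{B} \in \mathcal{K}$, then $\mathfrak{A}$ is existentially closed in $\mathfrak{B}$. By hypothesis, we may assume
that $\mathfrak{B}$ is simple. By 6.4 there is a commutative diagram

$$
\begin{array}{ccc}
 & \prod_D \mathfrak{A} & \\
 {\scriptstyle d(\mathfrak{A})}\nearrow & & \nwarrow{\scriptstyle \eta} \\
 \mathfrak{A} & \subseteq & \mathfrak{B}
\end{array}
$$

where $\eta$ is a homomorphism. Now $\eta$ is not constant since $\mathfrak{A}$ is non-trivial.
Hence $\eta$ is one-one and so by 6.4 $\mathfrak{A}$ is existentially closed in $\mathfrak{B}$.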

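To see that $\mathfrak{A}$ is simple, consider a non-constant homomorphism
$f : \mathfrak{A} \to \mathfrak{C}$. Now $\mathfrak{A}$ can be embedded in a simple model $\mathfrak{B}$ and by 6.4 and
4.7, we have a commutative diagram

$$
\begin{array}{ccccc}
\mathfrak{B} & \xrightarrow{\ \eta\ } & \prod_D \mathfrak{A} & \xrightarrow{\ \prod_D f\ } & \prod_D \mathfrak{C} \\
\uparrow{\scriptstyle \subseteq} & {\scriptstyle d(\mathfrak{A})}\nearrow & & & \uparrow{\scriptstyle d(\mathfrak{C})} \\
\mathfrak{A} & & \xrightarrow{\qquad f \qquad} & & \mathfrak{C}
\end{array}
$$

where $\eta$ is a homomorphism. Since $d(\mathfrak{C})$ is one-one, $\prod_D f \circ \eta$ is non-
constant and therefore (since $\mathfrak{B}$ is simple) one-one. Since the diagram
commutes, we conclude that $f$ is one-one. □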


We conclude this section with a result about embeddings of rings due to
A. Robinson and M. Rabin (see ROBINSON [1962]). We could derive the
result from Theorem 6.1, but it is simpler to give a direct proof. A ring $R$
(possibly without an identity) is called a prime ring if for all non-zero $\alpha$ and
$\beta$ in $R$, $\alpha R \beta$ is non-zero.


6.7. THEOREM. Let $\mathcal{K}$ be a class of rings closed under ultraproducts. If $R$ is a
prime ring which is embeddable in a direct product of rings in $\mathcal{K}$, then $R$ is
embeddable in a member of $\mathcal{K}$.


PROOF. Suppose $R \subseteq \prod_I A_i$, where $A_i \in \mathcal{K}$ for all $i \in I$. For each non-zero
$\alpha$ in $R$ let $S_\alpha = \{i \in I \mid \alpha(i) \ne 0\}$. Since $R$ is a prime ring, $S = \{S_\alpha \mid \alpha \ne 0,
\alpha \in R\}$ has FIP (because $S_\alpha \cap S_\beta \supseteq S_{\alpha\gamma\beta}$ for some $\gamma \in R$). Thus by
Corollary 1.2, $S$ is contained in an ultrafilter $D$ over $I$. Let $A = \prod_D A_i$. By
hypothesis, $A \in \mathcal{K}$. Define a map $\eta : R \to A$ by $\eta(\alpha) = \alpha_D$; then $\eta$ is an
embedding since for each non-zero $\alpha$ in $R$, $S_\alpha \in D$. □


The following corollary is due to ROBINSON [1962] for the case of division
rings, and AMITSUR [1967] for the case of primitive rings. See AMITSUR
[1967], HERSTEIN [1968], Chapter 7 or HIRSCHELMANN [1972] for
generalizations and an important application to Posner’s Theorem.


6.8. COROLLARY. If $R$ is a prime ring which is a subring of a direct product of
division rings (resp. primitive rings), then $R$ is a subring of a division ring
(resp. primitive ring).


PROOF. The class of division rings is elementary, hence closed under
ultraproducts. The class of primitive rings is pseudo-elementary, hence
closed under ultraproducts (see Theorem 4.6). □


Some more applications of Theorem 6.7 to the study of polynomial
identities and rational identities of division rings can be found in AMITSUR
[1965 and 1966].


7. Bounds in polynomial ideals 


Throughout this section we shall be considering polynomial rings
$F[X_1 \cdots X_n]$ over a field $F$, where $n$ is an arbitrary but fixed positive
integer. A polynomial in $F[X_1 \cdots X_n]$ will be written as $f = \sum_M c_M X^M$ where
$M$ ranges over all $n$-tuples $(m_1 \cdots m_n)$ of natural numbers and $X^M$ is an
abbreviation for $X_1^{m_1} X_2^{m_2} \cdots X_n^{m_n}$. The degree of $M$, denoted $\delta(M)$, is
$\sum_{i=1}^{n} m_i$ and the degree of $f$ is $\max\{\delta(M) \mid c_M \ne 0\}$.
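
As a concrete aid, the multi-index notation can be modelled in a few lines of Python. This is only an illustrative sketch: the names `Monomial`, `Polynomial`, `delta` and `degree` are chosen here, and integer coefficients stand in for elements of a field.

```python
from typing import Dict, Tuple

Monomial = Tuple[int, ...]          # exponent tuple M = (m_1, ..., m_n)
Polynomial = Dict[Monomial, int]    # maps M to its coefficient c_M


def delta(M: Monomial) -> int:
    """Degree of the exponent tuple M: the sum m_1 + ... + m_n."""
    return sum(M)


def degree(f: Polynomial) -> int:
    """Degree of f: the maximum of delta(M) over monomials with c_M != 0.

    Raises ValueError for the zero polynomial, whose degree is not
    defined by the formula above.
    """
    degs = [delta(M) for M, c in f.items() if c != 0]
    if not degs:
        raise ValueError("the zero polynomial has no degree")
    return max(degs)


# f = 3*X1^2*X2 + 5*X2 - X1 in two variables (n = 2)
f = {(2, 1): 3, (0, 1): 5, (1, 0): -1}
```

Storing a polynomial as a map from exponent tuples to coefficients mirrors the sum over $M$ directly; monomials with zero coefficient are simply ignored when the degree is computed.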

Let $(F_i : i \in I)$ be an indexed family of fields and let $D$ be an ultrafilter
over $I$. Let $F$ denote the field $\prod_D F_i$ and for $c \in \prod_I F_i$ let $\bar c$ denote $c_D$. Let $R$
denote the ring $\prod_D (F_i[X_1 \cdots X_n])$. Now $R$ is not a polynomial ring over
$F$ (it is not even Noetherian) but there is a canonical mapping

$$\mu : F[X_1 \cdots X_n] \to R$$

defined as follows. If $f = \sum_M \bar c_M X^M \in F[X_1 \cdots X_n]$ and $\deg f = d$, then
$\mu(f) = \tilde f_D$ where $\tilde f(i) = \sum_{\delta(M) \le d} c_M(i) X^M$.

An indexed family $f = (f_i : i \in I)$ of polynomials $f_i \in F_i[X_1 \cdots X_n]$ is
called bounded if there exists $m$ such that $\deg f_i \le m$ for all $i \in I$. We leave
to the reader the task of checking the following.


7.1. THEOREM. The function $\mu$ is a well-defined embedding of rings whose
range is the subring of elements of $R$ represented by bounded families.
Moreover, for any $f \in F[X_1 \cdots X_n]$ and any $\bar a_1, \dots, \bar a_n \in F$ we have
$f(\bar a_1 \cdots \bar a_n) = \bar v$, where $v(i) = \tilde f(i)(a_1(i) \cdots a_n(i))$ and $\tilde f$ is a bounded
family with $\mu(f) = \tilde f_D$.


If $f = (f_i : i \in I)$ is a bounded family, let $\bar f$ denote the unique element of
$F[X_1 \cdots X_n]$ such that $\mu(\bar f) = f_D$. As a consequence of 7.1 and the
fundamental theorem we obtain the following.


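7.2. COROLLARY. Let $f = (f_i : i \in I)$, $g^{(l)} = (g_i^{(l)} : i \in I)$, $h^{(l)} = (h_i^{(l)} : i \in I)$ be
bounded families of polynomials, for $l = 1, \dots, r$. Then

$$
\bar f = \sum_{l=1}^{r} \bar h^{(l)} \bar g^{(l)}
\quad\text{if and only if}\quad
\Bigl\{ i \in I \Bigm| f_i = \sum_{l=1}^{r} h_i^{(l)} g_i^{(l)} \Bigr\} \in D.
$$

Moreover, for any $\bar a_1, \dots, \bar a_n$ in $F$, $\bar f(\bar a_1 \cdots \bar a_n) = 0$ if and only if

$$\{ i \in I \mid f_i(a_1(i) \cdots a_n(i)) = 0 \} \in D. \qquad \square$$

Let us consider some applications.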


7.3. THEOREM (ROBINSON [1955a]). Given positive integers $n$ and $d$ there
exists a positive integer $m$ such that for any algebraically closed field $F$ and
any polynomials $f, g^{(1)}, \dots, g^{(r)}$ in $F[X_1 \cdots X_n]$ of degree $\le d$, if every
common zero of $g^{(1)}, \dots, g^{(r)}$ in $F$ is a zero of $f$, then $f^m$ belongs to
$(g^{(1)}, \dots, g^{(r)})$, the ideal generated by $g^{(1)}, \dots, g^{(r)}$.


PROOF. Note that the content of the theorem is that $m$ can be chosen to
depend only on $n$ and $d$. Hilbert’s Nullstellensatz, which we use in the
proof, asserts that given $f, g^{(1)}, \dots, g^{(r)}$ satisfying the hypotheses, there is
some $m$ (a priori depending on $f, g^{(1)}, \dots, g^{(r)}$) such that $f^m \in
(g^{(1)}, \dots, g^{(r)})$. If the theorem is false, then for each $i \in I = \omega$ there exists an
algebraically closed field $F_i$ and polynomials $f_i, g_i^{(1)}, \dots, g_i^{(r)} \in F_i[X_1 \cdots X_n]$
of degree $\le d$ such that $f_i^{\,i} \notin (g_i^{(1)}, \dots, g_i^{(r)})$ but every common zero of
$g_i^{(1)}, \dots, g_i^{(r)}$ in $F_i$ is a zero of $f_i$. (Note that we can assume that $r$ does not
depend on $i$ since every ideal generated by polynomials of degree $\le d$ has a
basis of cardinality $\le$ the dimension (over the field of coefficients) of the vector
space of polynomials of degree $\le d$.) Let $D$ be a non-principal ultrafilter
over $I$. Since $(f_i : i \in I)$ and $(g_i^{(l)} : i \in I)$, $l = 1, \dots, r$, are bounded families we
may consider $\bar f, \bar g^{(1)}, \dots, \bar g^{(r)} \in F[X_1 \cdots X_n]$. Every common zero of
$\bar g^{(1)}, \dots, \bar g^{(r)}$ in $F$ is a zero of $\bar f$ by Corollary 7.2 since this is true for all
$i \in I$. Hence by the Nullstellensatz, $\bar f^{\,k} \in (\bar g^{(1)}, \dots, \bar g^{(r)})$ for some $k \in \omega$.
Then by 7.2 we have $\{i \in I \mid f_i^{\,k} \in (g_i^{(1)}, \dots, g_i^{(r)})\} \in D$. But this contradicts
the choice of $f_i, g_i^{(1)}, \dots, g_i^{(r)}$ and the fact that every element of $D$ is infinite. □
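
The final contradiction rests on a small monotonicity observation, spelled out here as a sketch. The abbreviation $J_i$ for the ideal generated by the $i$-th family of $g$'s is introduced only for this display, and the counterexamples are assumed to have been chosen so that the $i$-th power of $f_i$ lies outside $J_i$.

```latex
\[
f_i^{\,i} \notin J_i \ \text{ and } \ k \le i
\;\Longrightarrow\; f_i^{\,k} \notin J_i
\qquad \bigl(\text{otherwise } f_i^{\,i} = f_i^{\,i-k} \cdot f_i^{\,k} \in J_i \bigr),
\]
\[
\text{hence}\qquad \{\, i \in \omega : f_i^{\,k} \in J_i \,\} \subseteq \{0, 1, \dots, k-1\}.
\]
```

A finite set cannot belong to a non-principal ultrafilter, which is the contradiction used in the proof.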


The following result, giving bounds for Hilbert’s basis theorem, is proved 
in a similar fashion. (See SEIDENBERG [1971] for a purely algebraic proof.) 


7.4. THEOREM. Given positive integers $n$ and $d$ there exists $m$ such that for
any field $F$, any strictly ascending chain of ideals in $F[X_1 \cdots X_n]$ which are
generated by polynomials of degree $\le d$ is of length $\le m$.


One can also employ these methods in order to obtain bounds for the
number of squares required to represent a positive definite rational
function over an ordered field as a sum of squares (see, for example,
ROBINSON [1973b]).

The following result about bounds requires a different mode of proof
(see ROBINSON [1973a] for a discussion of the difference). It was first proved
by KÖNIG [1903] and HERMANN [1926] using complex computational
methods. The following proof is due to ROBINSON [1973a].


7.5. THEOREM. Given positive integers $n$ and $d$ there exists $m$ such that for
any field $F$ and any polynomials $f, g^{(1)}, \dots, g^{(r)}$ in $F[X_1 \cdots X_n]$ of degree
$\le d$, if $f \in (g^{(1)}, \dots, g^{(r)})$, then there exist polynomials $h^{(1)}, \dots, h^{(r)}$ in
$F[X_1 \cdots X_n]$ of degree $\le m$ such that $f = \sum_{l=1}^{r} h^{(l)} g^{(l)}$.


PROOF. By standard considerations of linear algebra it suffices to prove the
theorem for algebraically closed fields. Suppose the theorem is false. Then
for every $i \in I = \omega$ there exists an algebraically closed field $F_i$ and
polynomials $f_i, g_i^{(1)}, \dots, g_i^{(r)} \in F_i[X_1 \cdots X_n]$ of degree $\le d$ such that
$f_i \in (g_i^{(1)}, \dots, g_i^{(r)})$, but there are no polynomials $h_i^{(1)}, \dots, h_i^{(r)}$ of degree $\le i$
such that $f_i = \sum_{l=1}^{r} h_i^{(l)} g_i^{(l)}$. Let $D$ be a non-principal ultrafilter over $I$ and
consider $\bar f, \bar g^{(1)}, \dots, \bar g^{(r)} \in F[X_1 \cdots X_n]$. Let $G$ denote $(\bar g^{(1)}, \dots, \bar g^{(r)})$, the ideal
generated by $\bar g^{(1)}, \dots, \bar g^{(r)}$ in $F[X_1 \cdots X_n]$. If we can prove that $\bar f \in G$ then
we will be done since then there will exist $\bar h^{(1)}, \dots, \bar h^{(r)}$ in $F[X_1 \cdots X_n]$ such
that $\bar f = \sum_{l=1}^{r} \bar h^{(l)} \bar g^{(l)}$, and this, by 7.2, will contradict the choice of the
$f_i, g_i^{(1)}, \dots, g_i^{(r)}$.

Since the case $G = F[X_1 \cdots X_n]$ is trivial, we may assume that $G =
Q_1 \cap \cdots \cap Q_k$ where the $Q_j$’s are proper primary ideals (cf. ZARISKI and
SAMUEL [1958], Chapter IV, §4). So it suffices to prove that $\bar f \in Q_j$ for
$j = 1, \dots, k$. Consider a fixed $Q_j = Q$. By a change of coordinates we may
assume that $Q \subseteq (X_1, \dots, X_n)$. (If $(a_1, \dots, a_n)$ is a zero of $Q$ in $F$, let
$Y_\nu = X_\nu - a_\nu$ for $\nu = 1, \dots, n$.) Let $P$ denote $(X_1, \dots, X_n)$. By the Krull
Intersection Theorem (ZARISKI and SAMUEL [1958], Chapter IV, Theorem
12'), it suffices to prove that $\bar f \in Q + P^m$ for every $m \ge 0$.

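Now since $f_i \in (g_i^{(1)}, \dots, g_i^{(r)})$, there exist polynomials $p_i^{(1)} \cdots p_i^{(r)}$ such that
$f_i = \sum_{l=1}^{r} p_i^{(l)} g_i^{(l)}$. Let $p_D^{(l)}$ be the class of $(p_i^{(l)} : i \in I)$ in $R$. Then $f_D =
\sum_{l=1}^{r} p_D^{(l)} g_D^{(l)}$, i.e. $\mu(\bar f) = \sum_{l=1}^{r} p_D^{(l)} \mu(\bar g^{(l)})$. Fix $m > 0$. Let $q_D^{(l)}$ be the class of
$(q_i^{(l)} : i \in I)$ where $q_i^{(l)} = \sum_{\delta(M) < m} c_M(i) X^M$ if $p_i^{(l)} = \sum_M c_M(i) X^M$. Notice that
$q_D^{(l)}$ belongs to the range of $\mu$; say $q_D^{(l)} = \mu(\bar q^{(l)})$. Then

$$
\mu(\bar f) - \sum_{l=1}^{r} \mu(\bar q^{(l)})\, \mu(\bar g^{(l)}) = \sum_{l=1}^{r} \bigl(p_D^{(l)} - q_D^{(l)}\bigr)\, g_D^{(l)}.
$$

The left-hand side clearly belongs to the range of $\mu$; thus the right-hand
side shows that it belongs to $\mu(P^m)$. We conclude that $\bar f \in Q + P^m$. □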
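
The last step can be unpacked as follows. This is a sketch in which the bars denote the elements of $F[X_1 \cdots X_n]$ attached to bounded families as in 7.2, and the truncations $\bar q^{(l)}$ are assumed to keep exactly the monomials of degree less than $m$.

```latex
\[
\bar f \;=\;
\underbrace{\sum_{l=1}^{r} \bar q^{(l)} \bar g^{(l)}}_{\in\; G \,\subseteq\, Q}
\;+\;
\underbrace{\Bigl( \bar f - \sum_{l=1}^{r} \bar q^{(l)} \bar g^{(l)} \Bigr)}_{\in\; P^{m}}
\;\in\; Q + P^{m}.
\]
```

The second summand lies in $P^m$ because, for every index $i$, each monomial of $p_i^{(l)} - q_i^{(l)}$ has degree at least $m$, and multiplying by $g_i^{(l)}$ cannot lower the degree of a monomial.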


The following result is due to HERMANN [1926] (see also SEIDENBERG
[1974]). It would be of interest to have an ultraproduct proof of this result
in the spirit of the above, avoiding the computational methods of Hermann.
None is known at present.


7.6. THEOREM. Given positive integers $n$ and $d$ there exists $m$ such that for
any ideal $J$ of $F[X_1 \cdots X_n]$ which has a basis consisting of polynomials of
degree $\le d$, if $J$ is not prime, then there exist polynomials $f, g$ of degree $\le m$
such that $fg \in J$ but $f \notin J$ and $g \notin J$.


For an application of the methods of this section and Section 5 to the 
problem of resolutions of singularities in algebraic geometry, see EKLOF 
[1969]. 


SATURATION


8. Ultraproducts which are $\omega_1$-saturated


In Sections 5-7 we used ultraproducts to construct models which satisfied
given sets $\Sigma$ of first-order properties. Now we want to study a property of
ultraproducts (with respect to certain ultrafilters) called $\omega_1$-saturation
which is not first-order but which is model-theoretic in character and is very
useful in studying first-order properties of models.

Recall from Chapter A.2 that if $\mathfrak{A}$ is a model for $L$ and $X$ is a subset of
$A$, then $L_X$ denotes the language obtained by adding a constant symbol $c_a$
for each $a \in X$; and $\mathfrak{A}_X = (\mathfrak{A}, a)_{a \in X}$ is the model for $L_X$ obtained by
interpreting the new constant $c_a$ by the element $a$. Let $\Gamma(v)$ be a set of
formulas $\gamma(v)$ of $L_X$ which have $v$ as the only free variable; $\Gamma(v)$ is called a
type in $\mathfrak{A}_X$ if for every finite subset $\{\gamma_0(v), \dots, \gamma_m(v)\}$ of $\Gamma(v)$ there exists
$a \in A$ such that $\mathfrak{A}_X \models \gamma_i[a]$ for $i = 0, \dots, m$. We say $\Gamma$ is realized in $\mathfrak{A}_X$ if
there exists $a \in A$ such that $\mathfrak{A}_X \models \gamma[a]$ for all $\gamma(v)$ in $\Gamma(v)$. A model $\mathfrak{A}$ for
$L$ is called $\omega_1$-saturated if for every countable subset $X$ of $A$, every type in
$\mathfrak{A}_X$ is realized in $\mathfrak{A}_X$. In order to clarify the definition let us begin with two
important examples. A linearly ordered set $\mathfrak{A}$ is called an $\eta_1$-set (HAUSDORFF
[1914]) if whenever $B_1$ and $B_2$ are subsets of $A$ of cardinality $< \omega_1$
such that $B_1 < B_2$ (i.e. for all $b_1 \in B_1$, $b_2 \in B_2$, we have $b_1 < b_2$), then there
exists $a \in A$ such that $B_1 < a < B_2$ (i.e. for all $b_1 \in B_1$, $b_2 \in B_2$, we have
$b_1 < a < b_2$). Obviously an $\eta_0$-set is just a densely ordered set without first
or last element.


8.1. Lemma. If $\mathfrak{A}$ is an $\eta_0$-set which is $\omega_1$-saturated, then $\mathfrak{A}$ is an $\eta_1$-set.


PROOF. Suppose $B_1$ and $B_2$ are countable subsets of $A$ such that $B_1 < B_2$.
Let $X = B_1 \cup B_2$ and $\Gamma(v)$ be the set of all formulas of $L_X$ of the form

$$c_{b_1} < v < c_{b_2},$$

where $b_1 \in B_1$ and $b_2 \in B_2$. Since $\mathfrak{A}$ is an $\eta_0$-set, $\Gamma(v)$ is a type in $\mathfrak{A}_X$.
Therefore $\Gamma(v)$ is realized in $\mathfrak{A}_X$ by some $a \in A$. It is then clear that
$B_1 < a < B_2$. □


(The converse of Lemma 8.1 is true but we are not concerned with it
here. In Section 10, we deal with the more general definition of $\kappa$-saturated
models; 8.1 has a natural analog in that setting.)


8.2. Lemma. If $\mathfrak{A}$ is an infinite model which is $\omega_1$-saturated, then $\mathfrak{A}$ is
uncountable.


PROOF. Suppose to the contrary that $A$ is countable. Let $\Gamma(v)$ be the set of
all formulas of $L_A$ of the form

$$v \ne c_a$$

where $a \in A$. Since $A$ is infinite, $\Gamma(v)$ is a type in $\mathfrak{A}_A$, and hence is realized
in $\mathfrak{A}_A$, which is impossible. □


We are going to prove that ultraproducts over the right kind of ultrafilter
are $\omega_1$-saturated; for this we need a new class of ultrafilters. An ultrafilter
$D$ over $I$ is called $\omega_1$-incomplete if there exist pairwise disjoint subsets $Y_n$
of $I$ such that $\bigcup_{n \in \omega} Y_n = I$ but for all $n \in \omega$, $Y_n \notin D$. An $\omega_1$-incomplete
ultrafilter is clearly non-principal since for a principal ultrafilter there exists
$a \in I$ such that $Y \in D$ if and only if $a \in Y$. The converse holds if $I$ is
countable.


8.3. Lemma. Every non-principal ultrafilter on a countable set $I$ is
$\omega_1$-incomplete.


PROOF. Say $I = \{a_n : n \in \omega\}$. Let $Y_n = \{a_n\}$. Then $Y_n \notin D$ since $D$ is
non-principal, but $\bigcup_n Y_n = I$. □


It is consistent with the usual axioms of set theory (ZFC) to assume that 
8.3 holds for sets I of any infinite cardinality. However it is an open 
problem of set theory whether 8.3 can be proved (from ZFC) to hold for all 
infinite sets I. It is known that if 8.3 fails for I, then I is extraordinarily
large. The cardinality of the smallest such I is a measurable cardinal, as 
discussed in the anonymous appendix to Chapter B.3. 


8.4. Lemma. For any infinite set $I$ there is an $\omega_1$-incomplete ultrafilter over $I$.


PROOF. Since $I$ is infinite we can write $I = \bigcup_{n \in \omega} Y_n$ where the $Y_n$’s are
pairwise disjoint and infinite. Let $S = \{I - Y_n : n \in \omega\}$. Then $S$ has FIP so
by Corollary 1.2, $S$ is contained in an ultrafilter $D$ which is $\omega_1$-incomplete
by construction. □


8.5. THEOREM. Let $L$ be a countable language and let $D$ be an
$\omega_1$-incomplete ultrafilter over $I$. For any family $(\mathfrak{A}_i : i \in I)$ of models for $L$, the
ultraproduct $\prod_D \mathfrak{A}_i$ is $\omega_1$-saturated.


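PROOF. Let $X$ be a countable subset of $\prod_D A_i$ and $\Gamma(v)$ a type in $(\prod_D \mathfrak{A}_i)_X$.
Say $\Gamma(v) = \{\gamma_m(v) : m \in \omega\}$. (Notice that $\Gamma$ is countable since $L$ is
countable.) Let $Y_n$, $n \in \omega$, be pairwise disjoint subsets of $I$ such that
$Y_n \notin D$ but $\bigcup_{n \in \omega} Y_n = I$. Since $\Gamma(v)$ is a type in $(\prod_D \mathfrak{A}_i)_X$, there exists for
every $m \in \omega$ an element $f_D^{(m)}$ of $\prod_D A_i$ such that $(\prod_D \mathfrak{A}_i)_X \models \gamma_n[f_D^{(m)}]$ for
$n = 1, \dots, m$. Now define $g \in \prod_I A_i$ by $g(i) = f^{(m)}(i)$ if $i \in Y_m$. We claim that
$(\prod_D \mathfrak{A}_i)_X \models \gamma_m[g_D]$ for all $m \in \omega$. By Theorem 3.1 it suffices to prove that
$\{i \in I \mid \mathfrak{A}_i \models \gamma_m[g(i)]\} = Z_m$ is in $D$. But $Z_m$ contains
$I - \bigcup_{n < m} Y_n = \bigcap_{n < m} (I - Y_n)$ and hence is in $D$ since $Y_n \notin D$ for all $n$. □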
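
The definition of $g$ is stated very compactly. One standard way to carry out the choice in full detail is sketched below. The sets $X_m$ and $I_m$ and the function $m(i)$ are auxiliary names introduced here, not notation from the text; the type is enumerated as $\gamma_1, \gamma_2, \dots$, and the constants $c_x$ ($x \in X$) are assumed to be interpreted in each $\mathfrak{A}_i$ through fixed representatives of the elements of $X$.

```latex
\[
X_m = \{\, i \in I : \mathfrak{A}_i \models \exists v\,(\gamma_1(v) \wedge \cdots \wedge \gamma_m(v)) \,\} \in D
\quad \text{(by Theorem 3.1, since $\Gamma$ is a type)},
\]
\[
I_0 = I, \qquad
I_m = X_m \cap \bigcap_{n < m} (I - Y_n) \in D \quad (m \ge 1), \qquad
I_0 \supseteq I_1 \supseteq \cdots, \qquad
\bigcap_{m \in \omega} I_m = \emptyset,
\]
\[
m(i) = \max\{\, m : i \in I_m \,\}, \qquad
g(i) \in A_i \ \text{ chosen so that } \
\mathfrak{A}_i \models (\gamma_1 \wedge \cdots \wedge \gamma_{m(i)})[g(i)]
\ \text{ when } m(i) \ge 1,
\]
\[
i \in I_m \;\Longrightarrow\; m(i) \ge m \;\Longrightarrow\; \mathfrak{A}_i \models \gamma_m[g(i)],
\qquad \text{so} \qquad Z_m \supseteq I_m \in D .
\]
```

Here $m(i)$ is finite because the sets $Y_n$ cover $I$: an index lying in $Y_n$ is excluded from $I_m$ for every $m > n$. A witness $g(i)$ exists whenever $m(i) \ge 1$ because $i$ then belongs to $X_{m(i)}$.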


A first-order theory $T$ is said to be model-complete if whenever $\mathfrak{A}$ and $\mathfrak{B}$
are models of $T$ such that $\mathfrak{A}$ is a submodel of $\mathfrak{B}$, then $\mathfrak{A}$ is an elementary
submodel of $\mathfrak{B}$. For more on model-completeness see Chapter A.4. Here
we simply want to show how ultraproducts may be used to prove that a
theory is model-complete.


8.6. THEOREM (Robinson). The theory of real-closed fields is model- 
complete. 


PROOF. We shall assume the continuum hypothesis (CH). Suppose $R_1 \subseteq R_2$
are real-closed fields. Let $\varphi(v_1 \cdots v_n)$ be a formula of the language of fields
and let $a_1, \dots, a_n \in R_1$ such that $R_1 \models \varphi[a_1 \cdots a_n]$. We wish to prove that
$R_2 \models \varphi[a_1 \cdots a_n]$. We claim that we may assume that $R_1$ and $R_2$ are
countable. By the Löwenheim-Skolem theorem (see Chapter A.2) there is
a countable elementary submodel $R_1'$ of $R_1$ containing $a_1, \dots, a_n$, and there
is a countable elementary submodel $R_2'$ of $R_2$ containing $R_1'$. Thus we have
countable real-closed fields $R_1' \subseteq R_2'$, $R_1' \models \varphi[a_1, \dots, a_n]$, and we wish to
prove that $R_2' \models \varphi[a_1, \dots, a_n]$. This proves the claim.

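Let $D$ be a non-principal ultrafilter on a countable set $I$. By Theorem 4.7
there are elementary embeddings $d(R_\nu) : R_\nu \to \prod_D R_\nu$, for $\nu = 1, 2$. By 8.3
and 8.5, $\prod_D R_\nu$ is $\omega_1$-saturated and by 8.1 $\prod_D R_\nu$ is an $\eta_1$-set (with respect to
the unique ordering on a real closed field) for $\nu = 1, 2$. By a result of
ERDÖS, GILLMAN and HENRIKSEN [1955] any two real closed fields of
cardinality $\aleph_1$ which are $\eta_1$-sets are isomorphic, in fact any isomorphism of
countable subfields extends to an isomorphism of the real-closed fields.
Now by Lemma 8.2, $\mathrm{Card}(\prod_D R_\nu) \ge \aleph_1$. But $\mathrm{Card}(\prod_D R_\nu) \le \mathrm{Card}(\prod_I R_\nu) =
2^{\aleph_0} = \aleph_1$. Hence $\mathrm{Card}(\prod_D R_\nu) = \aleph_1$ for $\nu = 1, 2$. Therefore by the result cited
above, if $e : R_1 \to R_2$ is the inclusion map, there is an isomorphism
$\psi : \prod_D R_1 \to \prod_D R_2$ such that

$$
\begin{array}{ccc}
\prod_D R_1 & \xrightarrow{\ \psi\ } & \prod_D R_2 \\
{\scriptstyle d(R_1)}\uparrow & & \uparrow{\scriptstyle d(R_2)} \\
R_1 & \xrightarrow{\ e\ } & R_2
\end{array}
$$

commutes. Hence

$$
\begin{aligned}
R_1 \models \varphi[a_1 \cdots a_n]
&\iff \textstyle\prod_D R_1 \models \varphi[d(R_1)(a_1) \cdots d(R_1)(a_n)] \\
&\iff \textstyle\prod_D R_2 \models \varphi[\psi(d(R_1)(a_1)) \cdots \psi(d(R_1)(a_n))] \\
&\iff \textstyle\prod_D R_2 \models \varphi[d(R_2)(a_1) \cdots d(R_2)(a_n)] \\
&\iff R_2 \models \varphi[a_1 \cdots a_n]. \qquad \square
\end{aligned}
$$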


(Although we assumed CH in the proof, it follows from general logical
considerations, which we shall not give here, that the result actually
holds without this assumption.)

Let us take the opportunity to mention that 8.6 has important algebraic
consequences; in fact ROBINSON [1955b] (see also ROBINSON [1973b])
showed how the solution to Hilbert’s seventeenth problem (solved originally
by Artin) may be derived easily from Theorem 8.6. See 2.3 in Chapter
A.4.


8.7. COROLLARY (Tarski). The theory of real-closed fields is complete, i.e.
any two real-closed fields are elementarily equivalent.


PROOF. This follows immediately from 8.6 and the fact that any real-closed
field contains a copy of the real closure of the rationals (cf. Theorem 6 of
1.7 in Chapter A.4). □


In similar fashion we may prove the following result, using the well-known
theorem of Steinitz that any two algebraically closed fields of the
same characteristic and the same uncountable cardinality are isomorphic.


8.8. THEOREM (Tarski-Robinson). Let $p$ be 0 or a prime. The theory of
algebraically closed fields of characteristic $p$ is complete and
model-complete.


The continuum hypothesis is not needed in the proof of 8.8 because of
the following result (see BELL and SLOMSON [1969], p. 130, for a proof).


8.9. THEOREM. If $(A_i : i \in I)$ is a family of infinite sets and $D$ is any
$\omega_1$-incomplete ultrafilter over $I$, then the cardinality of $\prod_D A_i$ is at least $2^{\aleph_0}$.


We close with an interesting application of the methods of this and 
Section 5. The following is a special case of a theorem of Ax [1968]. 


8.10. THEOREM. Let $F$ be an algebraically closed field and let $\Psi : F^n \to F^n$
be a polynomial map, i.e. $\Psi(x_1 \cdots x_n) = (\psi_1(x_1 \cdots x_n), \dots, \psi_n(x_1 \cdots x_n))$
where $\psi_1, \dots, \psi_n$ are elements of $F[X_1 \cdots X_n]$. If $\Psi$ is injective, then $\Psi$ is
surjective.


PROOF. For a fixed $n$ and fixed degrees for $\psi_1, \dots, \psi_n$ the theorem is
expressible as a first-order sentence in the language of fields. Hence by
Theorem 8.8 it suffices to prove the theorem for one algebraically closed
field of each characteristic. For characteristic $p$, a prime, the theorem is
easily seen to be true by a simple counting argument for $\tilde F_p$, the union of all
the finite fields of characteristic $p$. The theorem is true for $\prod_D \tilde F_p$, where $D$ is
a non-principal ultrafilter over $P$, the set of primes, by Corollary 3.2, and by
Theorem 5.3, $\prod_D \tilde F_p$ has characteristic zero. □
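
The counting argument for finite fields is the pigeonhole principle: an injective map from a finite set to itself is surjective. The sketch below checks this by brute force over the prime field $Z/pZ$. The function name and the two sample maps are illustrative choices, not taken from the text, and the passage from finite fields to their union is not modelled.

```python
from itertools import product
from typing import Callable, Tuple

Point = Tuple[int, ...]


def is_injective_and_surjective(
    psi: Callable[[Point], Point], p: int, n: int
) -> Tuple[bool, bool]:
    """Return (injective, surjective) for psi on (Z/pZ)^n by enumeration."""
    points = list(product(range(p), repeat=n))
    images = {psi(x) for x in points}
    injective = len(images) == len(points)   # no two points share an image
    surjective = images == set(points)       # every point is an image
    return injective, surjective


p = 5


def shear(x: Point) -> Point:
    """(x1, x2) -> (x1 + x2^2, x2), a polynomial map with polynomial inverse."""
    return ((x[0] + x[1] ** 2) % p, x[1])


def square_first(x: Point) -> Point:
    """(x1, x2) -> (x1^2, x2), which identifies x1 and -x1."""
    return ((x[0] ** 2) % p, x[1])
```

On a finite set the two flags always agree, which is exactly the content of the counting argument; the maps above give one example of each outcome.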


9. Ultraproducts of valued fields 


One of the most important applications of ultraproducts to algebra is the
work of AX and KOCHEN [1965a, 1965b and 1966] and ERSHOV [1965] on
Artin’s Conjecture. The scope of this article dictates that we confine
ourselves to giving an introduction to this work, sketching very briefly the
role which ultraproducts play in the proof. We refer those whose appetite is
whetted by this introduction to the original papers or to one of the many
excellent expositions (e.g. CHANG and KEISLER [1973], KOCHEN [1975] and
ROBINSON [1969]). See also the discussion in 2.4 of Chapter A.4.

Artin’s Conjecture states that the following property holds for all $n$ and
$d$ when $F = Q_p$, the field of $p$-adic numbers.


($A_{n,d}$) Every homogeneous polynomial $f \in F[X_1 \cdots X_n]$ of degree
$d$ such that $n > d^2$ has a non-trivial zero in $F$.


The conjecture is inspired by the similarities between $Q_p$ and $Z_p((t))$,
the field of formal power series over $Z_p$, the field of order $p$. Lang proved
that $Z_p((t))$ satisfies the property $A_{n,d}$ for all $n$ and $d$. Artin’s Conjecture in
fact has been shown to be false for some $p$ by TERJANIAN [1966]. However,
Ax-Kochen and Ershov proved the following.


9.1. THEOREM. For any positive integers $n$ and $d$ there is a finite set of primes
$P_0(n, d)$ such that for every prime $p \notin P_0(n, d)$, $A_{n,d}$ holds when $F = Q_p$.


At the heart of the proof of 9.1 is the following result which also gives a
precise formulation of the intuitive analogy between $Q_p$ and $Z_p((t))$.


9.2. THEOREM. For any non-principal ultrafilter $D$ over $P$, the set of all
primes, the ultraproducts $\prod_D Q_p$ and $\prod_D Z_p((t))$ are isomorphic fields.


We leave to the reader the easy proof of 9.1 from 9.2. (Hint: use
Corollary 3.2, Corollary 1.5, Lang’s result and the fact that for fixed $n$ and
$d$, property $A_{n,d}$ is a first-order statement.)


As in the case of 8.6 and 8.8, Theorem 9.2 is proved by using algebraic
properties possessed by the ultraproducts which arise from their being
$\omega_1$-saturated. We shall not give the proof, which is considerably more
sophisticated than those of 8.6 and 8.8, but we do want to indicate the key
role played by $\omega_1$-saturation. In fact Ax and Kochen prove that $\prod_D Q_p$ and
$\prod_D Z_p((t))$ are isomorphic as valued fields. Define a valued field (with
cross-section) to be a model

$$\mathfrak{F} = (F, +, \cdot, 0, 1, \Delta, \le, |\ |)$$

where $(F, +, \cdot, 0, 1)$ is a field; $(\Delta, \cdot, 1, \le)$ is an ordered abelian group;
$|\ | : F - \{0\} \to \Delta$ is surjective and a valuation (i.e. $|xy| = |x||y|$, $|x + y| \le
\max\{|x|, |y|\}$); and $|x| = x$ for all $x \in \Delta$. $\mathfrak{F}$ is said to be $\omega$-pseudo-complete
if for any sequence $\{a_n \mid n \in \omega\}$ of elements of $F$ such that

(*) $|a_m - a_n| = |a_{n+1} - a_n|$

whenever $n < m < \omega$, there exists $a \in F$ such that

$$|a - a_n| = |a_{n+1} - a_n|$$

for all $n \in \omega$. The crucial result is then the following.


9.3. THEOREM. If $\mathfrak{F}$ is an $\omega_1$-saturated valued field, then $\mathfrak{F}$ is
$\omega$-pseudo-complete.


PROOF. If $X = \{a_n \mid n \in \omega\}$ is a sequence in $F$ satisfying (*) (such a
sequence is called $\omega$-pseudo-Cauchy) let $\Gamma(v)$ be the set of all formulas

$$\gamma_n(v) : \ |v - a_n| = |a_{n+1} - a_n|$$

for all $n \in \omega$. $\Gamma(v)$ is a subset of $L_X$, where $L$ is the language of valued
fields. Now $\Gamma(v)$ is a type in $\mathfrak{F}_X$ since for any $m \in \omega$

$$\mathfrak{F}_X \models \gamma_n[a_{m+1}]$$

for $n = 0, \dots, m$. Therefore there exists $a \in F$ such that

$$\mathfrak{F}_X \models \gamma_n[a]$$

for all $n \in \omega$. (We say $a$ is a pseudo-limit of $X$.) □
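
A familiar instance of condition (*) and of a pseudo-limit can be checked numerically. The sketch below uses the $p$-adic absolute value on the rationals, written multiplicatively as in the definition above. The helper names `vp` and `abs_p` and the choice of sequence are illustrative assumptions; the cross-section requirement is not modelled, and the rationals are of course not $\omega_1$-saturated.

```python
from fractions import Fraction


def vp(x: Fraction, p: int) -> int:
    """p-adic valuation of a non-zero rational x."""
    if x == 0:
        raise ValueError("the valuation of 0 is not defined here")
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:      # count factors of p in the numerator
        num //= p
        v += 1
    while den % p == 0:      # subtract factors of p in the denominator
        den //= p
        v -= 1
    return v


def abs_p(x: Fraction, p: int) -> Fraction:
    """p-adic absolute value |x| = p^(-vp(x)), with |0| = 0."""
    if x == 0:
        return Fraction(0)
    return Fraction(p) ** (-vp(x, p))


p = 3
# a_n = 1 + p + ... + p^(n-1): partial sums of a geometric series
a = [sum((Fraction(p) ** k for k in range(n)), Fraction(0)) for n in range(8)]
# 1/(1 - p) is the p-adic sum of the series and serves as a pseudo-limit
limit = Fraction(1, 1 - p)
```

For this sequence both $|a_m - a_n|$ (for $m > n$) and $|\mathrm{limit} - a_n|$ equal $p^{-n}$, so the sequence satisfies (*) and `limit` plays the role of the element $a$ in the definition.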


10. Saturated models 


In the short space left to us we define the $\kappa$-saturated and saturated
models, which generalize the notion of $\omega_1$-saturated models, and mention
without proof some important facts about these models. For more details
and proofs see CHANG and KEISLER [1973], CHANG [1973] or BELL and
SLOMSON [1969].

Let $\kappa$ be an infinite cardinal. A model $\mathfrak{A}$ for $L$ is called $\kappa$-saturated if for
every subset $X$ of $A$ of cardinality $< \kappa$, every type in $\mathfrak{A}_X$ is realized in $\mathfrak{A}_X$.
$\mathfrak{A}$ is called saturated if $\mathfrak{A}$ is $\kappa$-saturated where $\kappa$ is the cardinality of $A$. As
in Section 8 we can prove:


10.1. Lemma. (i) If $\mathfrak{A}$ is an infinite model which is $\kappa$-saturated, then $\mathfrak{A}$ has
cardinality $\ge \kappa$.
(ii) If $\mathfrak{A}$ is an $\eta_0$-set which is $\omega_\alpha$-saturated, then $\mathfrak{A}$ is an $\eta_\alpha$-set.


Models which are $\kappa$-saturated can be obtained as ultraproducts, though
the proof is more difficult than that of Theorem 8.5. The following theorem
was first proved by KEISLER [1964] using GCH and by KUNEN [1972]
without GCH.


10.2. THEOREM. Let $L$ be a language of cardinality $\le \kappa$ and let $I$ be a set of
power $\kappa$. Then there is an ultrafilter $D$ on $I$ such that for any family
$(\mathfrak{A}_i : i \in I)$ of models for $L$, the ultraproduct $\prod_D \mathfrak{A}_i$ is $\kappa^+$-saturated.


(The ultrafilters with the property given by the above theorem are called
$\kappa^+$-good ultrafilters and were introduced by KEISLER [1964].) As a corollary
we obtain the existence of saturated models. (The hypothesis of GCH at $\kappa$
is essential.)


10.3. COROLLARY. Let $\kappa$ be a cardinal such that $2^\kappa = \kappa^+$. If $\mathfrak{A}$ is a model for
$L$ of cardinality $\le \kappa$ (where $L$ has cardinality $\le \kappa$), then there is an
elementary extension $\mathfrak{B}$ of $\mathfrak{A}$ of cardinality $\kappa^+$ which is saturated.


PROOF. Let $I$ and $D$ be as in Theorem 10.2. Then $\mathfrak{B} = \prod_D \mathfrak{A}$ is
$\kappa^+$-saturated. By Theorem 4.7, $\mathfrak{B}$ is an elementary extension of $\mathfrak{A}$. Moreover
$\kappa^+ \le \mathrm{Card}(B) \le 2^\kappa$ by Lemma 10.1 and the definition of ultrapowers, so by
hypothesis, $\mathrm{Card}(B) = \kappa^+$ and hence $\mathfrak{B}$ is saturated. □


A key property of saturated models is the following, due to Vaught (see
MORLEY and VAUGHT [1962]). It explains the motivation behind the proofs
of 8.6 and 8.8.


10.4. THEOREM. Elementarily equivalent saturated models of the same 
cardinality are isomorphic. 


The following immediate consequence of 10.2 and 10.4 was also proved 
directly without GCH by SHELAH [1971]. 


10.5. COROLLARY (GCH). Let $\mathfrak{A}$ and $\mathfrak{B}$ be models for the same language
$L$. The following are equivalent:

(1) $\mathfrak{A}$ is elementarily equivalent to $\mathfrak{B}$.

(2) There is an ultrafilter $D$ on a set $I$ such that $\prod_D \mathfrak{A}$ is isomorphic to
$\prod_D \mathfrak{B}$.


Thus we see that elementary equivalence is an algebraic notion.


Prologue


Abraham Robinson created the theory of model completeness, and 
directed its growth for over 25 years. In this survey I will record the growth 
of the subject, with emphasis on interactions with algebra. This is for me a 
welcome chance to honor the memory of a unique mathematician, and a 
kind friend. 


Introduction 


I begin with some historical and methodological remarks. 

The basic ideas of the subject can be located in the literature of the
period 1950-1957. On the one hand, ROBINSON [1951, 1956] was at work on
model completeness of specific algebraic examples. On the other hand,
TARSKI and VAUGHT [1957] developed some fundamental results about
elementary extensions (then called arithmetical extensions). I am not aware
of any occurrence of the idea in earlier work, although there are obvious
links with the Löwenheim-Skolem Theorem and Tarski’s quantifier
elimination for real closed fields.

Why is it natural to study model completeness? 

1 think one can fairly say that elementary equivalence is the most 
fundamental concept in model theory. It is the analogue of isomorphism in 
general algebra. (One of the most striking results of the theory was the 
explication of elementary equivalence in terms of isomorphism of ul- 
trapowers (SHELAH [1971]). By referring only to elementary equivalence 
and the ‘“‘dual’’ concept of complete theory, one can formulate and obtain 
quite a few basic results, e.g. the Compactness Theorem, a weak 
Léwenheim-Skolem Theorem, and various important applications (for 
example, via ultraproducts, as in Chapter A.3). But already in the case of 
the Downward Lowenheim-Skolem Theorem, a close analysis would lead 
one to see that a stronger result was being proved, and that it was 
appropriate to introduce the notions of elementary substructure and 
elementary extension. 

Again, model theory is a little older than category theory, and was 
certainly more subtle in 1950, but the subject could have benefited early on 
from a category-theoretic approach. This would inevitably have led one to 
elementary maps, of which elementary extensions are a special case. 

It is in category-theoretic terms that one can best explain the advantages
of introducing elementary maps. Tarski’s basic theorem (TARSKI and
VAUGHT [1957]) tells us that if L is a first order logic, then the category of
L-structures with elementary maps has direct limits, and that the same
holds for the subcategory of models of a fixed L-theory. Another desirable
feature of the category and the above subcategories is the amalgamation
property (see JÓNSSON [1965]). All this becomes relevant in obtaining the
deeper results of the period 1955-present. These ideas show up in the study
of homogeneous-universal models (see, e.g., CHANG and KEISLER [1973]),
saturated models (CHANG and KEISLER [1973]), Morley theory (MORLEY
[1965]), automorphisms and indiscernibles (GAIFMAN [1967]). The
systematic use of Skolem functions in set theory (e.g. JENSEN [1972]) is a
related phenomenon.

The final source of the idea, and the one most vital to Robinson, is the
notion


relatively algebraically closed in,


coming from classical field theory. (Strictly speaking, a naive generalization
of the above notion would give a concept weaker than ≺₁, because in the
field theory case one was dealing with polynomials in one variable. But in
interesting cases the two coincide.) Robinson abstracted from this source a
notion of model completion of a theory, and proved that the model
completion is unique if it exists. The canonical example was the theory of
algebraically closed fields, as the completion of the theory of fields. The
systematic use of these ideas by Robinson led to the nicest applications of
model theory, to Hilbert’s 17-th Problem (ROBINSON [1955]), and to Artin’s
Conjecture (AX and KOCHEN [1965a], ERŠOV [1965]). Ax’s work (AX [1968])
on finite fields is also in this line of development. In another direction,
Robinson’s analysis led to remarkable progress in the study of differential
fields (ROBINSON [1959a], SACKS [1972]). (It is however true to say that there
are no applications of model completeness to problems in differential
fields.)

Much later, in 1969, Robinson gave the general theory new vigor when
he linked it with forcing (BARWISE and ROBINSON [1970], ROBINSON [1971]).
In this way many new concepts appeared, and instruments for a finer
analysis of previously intractable algebraic examples. New concepts of
completion appeared, and some surprising connections with categoricity
were established (SARACINO [1973]). The notion of existentially closed
structure was rescued from obscurity (EKLOF and SABBAGH [1971]), and a
satisfactory theory emerged.

The new applications can reasonably be described as negative results, in
contrast to the work on Hilbert’s 17-th Problem and Artin’s Conjecture.
That is, the new results typically showed that some natural structures are
extremely complicated (MACINTYRE [1972], HIRSCHFELD and WHEELER
[1975]). But unquestionably the algebra and Robinson’s methods work
nicely together.

The final development to be discussed here is the connection between
sheaves and model completeness results in ring theory. Here, a positive
result of LIPSHITZ and SARACINO [1973] and CARSON [1973] inspired a
transfer theorem (MACINTYRE [1973a], WEISSPFENNING [1973]). From this
one can obtain an analogue of Hilbert’s 17-th Problem for real regular
rings.

The main topic omitted because of lack of space is the study of
hierarchies for existentially closed structures, and the resulting definition of
generic structures without using forcing (SIMMONS [1972b], [1973b],
HENRARD [1973]). This work is important, but has not till now made contact
with algebra.


1. Basic concepts and Robinson’s Test 


1.1. For the basic concepts of first order logic, I refer to Chapters A.1 and 
A.2. For material on ultraproducts, see Chapter A.3. 

I wish to keep notation as informal as possible, so, for example, I make
no notational distinction between a structure and its underlying set. m will
be a finite tuple (m₁, ..., m_k), and if f is a map defined on each m_i, then
f(m) will be (f(m₁), ..., f(m_k)). If c is an L-constant, c^𝔐 is the
interpretation of c in 𝔐.


1.2. Let L be a first order language. I associate with L a category 𝒞_L of
L-structures. The objects of 𝒞_L are the L-structures. The morphisms of 𝒞_L
are the monomorphisms of L-structures. That is, a map f: 𝔐 → 𝔑 is a
morphism iff

(i) for each individual constant c of L, f(c^𝔐) = c^𝔑, and

(ii) for each atomic L-formula φ, and tuple m from 𝔐,


(*) 𝔐 ⊨ φ(m) iff 𝔑 ⊨ φ(f(m)).


Obviously we get a category 𝒞_L in this way. Obviously 𝒞_L has direct limits.
If T is any L-theory, we have a corresponding subcategory 𝒞_T, whose
objects are the models of T. (Such a class is called an EC_Δ class.) 𝒞_T may
not have direct limits. An easy example is got by taking T as the theory of
ordered sets with first element. In fact, by the Chang–Łoś–Suszko Theorem
(ROBINSON [1963]), 𝒞_T has direct limits iff T is an ∀∃ theory.

A morphism f is elementary if (*) holds for all formulas φ. Let 𝒞_L^≺ be
the category consisting of L-structures with elementary maps, and let 𝒞_T^≺
be the corresponding category of models of T. The most fundamental fact
is:


THEOREM 1 (TARSKI and VAUGHT [1957]). 𝒞_L^≺ has direct limits.

COROLLARY. 𝒞_T^≺ has direct limits.


In plainer terms, the union of an elementary chain is an elementary 
extension of each member of the chain. 
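
In symbols, the chain form of Theorem 1 can be sketched as follows. This is only a restatement in LaTeX; the indexing of the chain by natural numbers is an assumption made for readability (the same statement holds for chains indexed by any ordinal).

```latex
% Restatement sketch of the elementary chain theorem (needs amsmath, amssymb).
% Assumption: the chain is indexed by n < omega; the text does not fix an index set.
\[
\mathfrak{M}_0 \prec \mathfrak{M}_1 \prec \mathfrak{M}_2 \prec \cdots
\quad\Longrightarrow\quad
\mathfrak{M}_n \prec \bigcup_{k<\omega} \mathfrak{M}_k
\quad\text{for every } n<\omega .
\]
```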


Note. The preceding ideas and results go through in quite a wide setting. 
For example, the concepts, examples, and results all work for L..... The 
concept makes good sense for L(Q), Q a cardinality quantifier (see 5.5 of 
Chapter A.1) but the Tarski result would go through only for long chains. 
In general what is needed is a logic with securable or continuous quantifiers 
(MAKOWSKY [1973]). The Tarski proof has nothing to do with compactness.


1.3. We come now to another category-theoretic property of ≺, which this
time does depend on compactness. The proof of Theorem 2 is a simple
application of the Compactness Theorem (see 2.4 of Chapter A.1), and can
be found in SIMMONS [1972a].


THEOREM 2. Suppose L is a first order language, and 𝔄, 𝔐, 𝔑 are
L-structures. Given morphisms


𝔄 ---> 𝔐
|
v
𝔑


in 𝒞_L^≺, we can complete the diagram to a commuting square in 𝒞_L^≺.


Note that I do not claim that pushouts exist. But we can improve
Theorem 2 a little, as Bacsich and Fisher showed me in 1971. Namely, in
completing the above diagram, we can insist that f(m) ≠ g(n) unless
m = n ∈ 𝔄. See BACSICH and ROWLANDS-HUGHES [1974].


1.4. There is an interesting characterization of elementary maps in terms
of ultrapowers. This comes from the Keisler-Shelah Theorem (CHANG and
KEISLER [1973], SHELAH [1971]).

THEOREM 3. f: 𝔐 → 𝔑 is elementary iff there exists an index set I, an
ultrafilter D on I, and an isomorphism


g: 𝔐^I/D ≅ 𝔑^I/D


such that the diagram


           f
   𝔐 ----------> 𝔑
   |              |
 Δ |              | Δ
   v       g      v
 𝔐^I/D -------> 𝔑^I/D


commutes, where the Δ are the natural diagonal embeddings.


1.5. Before arriving at the notion of model completeness, I have to
introduce some weakenings of ≺.


DEFINITION. f: 𝔐 → 𝔑 is an ∀_n-map, also written

𝔐 ≺_n 𝔑,

iff (*) of 1.2 holds for all formulas φ which are prenex with at most n
alternations of quantifiers, and with leading quantifier ∀.

So ≺₀ corresponds to embedding, and ≺₁ is a natural generalization of
relatively algebraically closed in. Obviously, f is elementary iff f is ∀_n for
each n. Now I state a key result.


LEMMA 4 (CHANG and KEISLER [1973]). f: 𝔐 → 𝔑 is ∀₁ if and only if there
exists an embedding g: 𝔑 → 𝔐* such that g ∘ f is elementary.


The proof is a routine compactness argument. 𝔐* can be chosen as an
ultrapower of 𝔐, and g ∘ f as the diagonal map.
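
The compactness argument behind Lemma 4 can be sketched as below. This is a sketch under stated assumptions, not the proof in the cited source: the names Diag, Diag_el and Σ are notation introduced here, and the argument yields some elementary extension 𝔐* rather than specifically an ultrapower.

```latex
% Sketch of the direction "f is \forall_1  =>  g exists" (needs amsmath, amssymb).
% Assumed notation, not from the text:
%   \mathrm{Diag}(\mathfrak{N})               quantifier-free diagram of N
%   \mathrm{Diag}_{\mathrm{el}}(\mathfrak{M}) elementary diagram of M
% The constant naming m in M is identified with the constant naming f(m) in N.
\begin{align*}
&\text{Claim: } \Sigma := \mathrm{Diag}_{\mathrm{el}}(\mathfrak{M}) \cup \mathrm{Diag}(\mathfrak{N})
  \text{ is finitely satisfiable.}\\
&\text{A finite part of } \mathrm{Diag}(\mathfrak{N}) \text{ is a single quantifier-free }
  \psi(f(m), n), \text{ so } \mathfrak{N} \models \exists y\, \psi(f(m), y).\\
&\text{Since } f \text{ is } \forall_1,\ \mathfrak{M} \models \exists y\, \psi(m, y),
  \text{ so } \mathfrak{M} \text{ (suitably expanded) satisfies that finite part.}\\
&\text{By compactness } \Sigma \text{ has a model } \mathfrak{M}^{*}\colon\quad
  \mathfrak{M} \prec \mathfrak{M}^{*}, \qquad
  g\colon \mathfrak{N} \hookrightarrow \mathfrak{M}^{*}, \qquad
  g \circ f \text{ elementary.}
\end{align*}
```

The converse direction is easier: an existential statement true in 𝔑 passes up to 𝔐* along the embedding g, and then down to 𝔐 because g ∘ f is elementary.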


1.6. DEFINITION. T is model complete iff every embedding (∀₀-map) in 𝒞_T
is elementary.


The next theorem is geared to applications.


THEOREM 5 (Robinson’s Test). A theory T is model complete if and only if
every embedding in 𝒞_T is ∀₁.

PROOF. I give the proof mainly because it uses the method of alternating
chains, which is the main method of the general theory.

The proof of (⇒) is clear. To prove (⇐), suppose that 𝔐 → 𝔑 is an
embedding in 𝒞_T. So it is ∀₁. By Lemma 4, I get


𝔐 → 𝔑 → 𝔐₁,   with 𝔐 ≺ 𝔐₁,


commuting. Of course, 𝔐₁ ⊨ T, so I do the same for the pair 𝔑, 𝔐₁, to get


𝔑 → 𝔐₁ → 𝔑₁,   with 𝔑 ≺ 𝔑₁.


Proceeding in this way I construct 𝔐_n, 𝔑_n with

𝔐 → 𝔑 → 𝔐₁ → 𝔑₁ → ··· → 𝔐_n → 𝔑_n → 𝔐_{n+1} → 𝔑_{n+1} → ···,


where the diagram commutes, and where 𝔐 → 𝔐₁, 𝔑 → 𝔑₁, 𝔐_n → 𝔐_{n+1},
𝔑_n → 𝔑_{n+1} are all elementary. Form lim 𝔐_n = lim 𝔑_n = 𝔓, say. Then by
Tarski, the natural maps


𝔐 → 𝔓,   𝔑 → 𝔓


are elementary, whence 𝔐 → 𝔑 is elementary. Therefore T is model
complete. □


1.7. In first applications, model completeness was used mainly as a tool to 
prove completeness, by using the Prime Model Test, whose proof is 
obvious. 


THEOREM 6 (Prime Model Test). If T is model complete, and there is a
model 𝔐 of T embeddable in all models of T, then T is complete.


Example. T = theory of algebraically closed fields of characteristic 0. 𝔐 is
the field of algebraic numbers. Theorem 7 will tell us that T is model
complete. So T is complete.
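
The "obvious" proof of the Prime Model Test can be written out as a short chain of implications. The names 𝔐₀ (for the prime model) and 𝔑₁, 𝔑₂ (for arbitrary models of T) are introduced here only for illustration.

```latex
% Sketch of the Prime Model Test (needs amsmath, amssymb).
% Assumed names: \mathfrak{M}_0 is the model embeddable in all models of T;
% \mathfrak{N}_1, \mathfrak{N}_2 are arbitrary models of T.
\[
\mathfrak{N}_1, \mathfrak{N}_2 \models T
\;\Longrightarrow\;
\mathfrak{M}_0 \hookrightarrow \mathfrak{N}_i \ (i=1,2)
\;\Longrightarrow\;
\mathfrak{M}_0 \prec \mathfrak{N}_i \ \text{(model completeness)}
\;\Longrightarrow\;
\mathfrak{N}_1 \equiv \mathfrak{M}_0 \equiv \mathfrak{N}_2 .
\]
```

Since any two models of T are then elementarily equivalent, T is complete.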


1.8. Robinson’s Test is of great practical value, as I shall indicate later. By
trivial reductions, the test comes down to showing that embeddings respect
formulas ∃y A(v, y), where A is a conjunction of atomic formulas and their
negations. That is, if 𝔐 ⊆ 𝔑, 𝔐, 𝔑 models of the theory in question, and
𝔑 ⊨ ∃y A(m, y), then 𝔐 ⊨ ∃y A(m, y), whenever m is a tuple from 𝔐.
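
The reductions involved can be sketched as follows. An embedding preserves every universal formula exactly when existential formulas with parameters from the smaller structure pass down to it, and an existential formula splits into finitely many cases by disjunctive normal form. The names θ, A_i and r are illustrative and do not appear in the text.

```latex
% Sketch of the reduction to primitive existential formulas (needs amsmath).
% Assumed names: \theta is quantifier-free; each A_i is a conjunction of
% atomic formulas and negated atomic formulas (a disjunct of the DNF of \theta).
\[
\exists y\, \theta(v,y)
\;\equiv\;
\exists y \bigvee_{i=1}^{r} A_i(v,y)
\;\equiv\;
\bigvee_{i=1}^{r} \exists y\, A_i(v,y).
\]
```

So it suffices to handle each formula ∃y A_i(v, y) separately.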
2. Applications: Nullstellensatz, Hilbert’s 17-th Problem,
Artin’s Conjecture, and integral definite functions


2.1. I wish to follow as much as possible the historical development of the 
subject. The main applications followed the material of Section 1, and were 
carried out between 1950 and 1970. After 1970 we entered a new period of 
theoretical development, whose applications (to be given in Section 5) have 
a different character. 

The applications in this section are to field theory. The inspiration is 
Robinson’s. What must we know to apply Robinson’s Test? 

Let T be a theory of fields. We have to know that if 𝔐, 𝔑 ⊨ T, 𝔐 ⊆ 𝔑,
then 𝔐 ≺₁ 𝔑. By the reduction mentioned earlier, we must show: If Σ is a
finite system of equations and inequations over 𝔐 (i.e. with parameters
from 𝔐), and Σ is solvable in 𝔑, then Σ is solvable in 𝔐.

In field theory, inequations can be replaced by equations, because of the
equivalence


a ≠ 0 ⟺ ∃y (a·y = 1),


so we may assume Σ is a system of equations. But then our condition on
Σ begins to look like a Nullstellensatz, and Robinson liked and pursued this
analogy. A weak version of Hilbert’s Nullstellensatz (LANG [1958]) says:

If 𝔐 is algebraically closed, and Σ has a solution in 𝔑, then Σ has a
solution in 𝔐.

But this is just the data needed to apply Robinson’s Test. So from the 
Nullstellensatz we can derive the model completeness of the theory of 
algebraically closed fields. This result was first obtained by Tarski [1951] 
by quantifier elimination. Tarski did not use the Nullstellensatz. 
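
As a small worked illustration of replacing inequations by equations in a field: the polynomials p, q and the extra variable y below are hypothetical, chosen only to show the device (essentially the one known as the Rabinowitsch trick).

```latex
% Worked illustration (needs amsmath). The polynomials p, q and the new
% variable y are hypothetical; they are not taken from the text.
\[
\Sigma\colon\ \{\, p(x) = 0,\ \ q(x) \neq 0 \,\}
\qquad\longrightarrow\qquad
\Sigma'\colon\ \{\, p(x) = 0,\ \ q(x)\cdot y - 1 = 0 \,\}
\]
% In any field F: \Sigma has a solution x in F
% if and only if \Sigma' has a solution (x, y) in F, taking y = q(x)^{-1}.
```

The price is one new variable per inequation, which is harmless since the system stays finite.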

Conversely, we can always derive a Nullstellensatz for any model
complete theory of fields. So in particular the Nullstellensatz given above
for algebraically closed fields follows from Tarski’s analysis.

ROBINSON [1951] also found a model completeness proof independent of
the Nullstellensatz, and this style of proof became very popular. His proof
goes thus. Suppose 𝔐 ⊆ 𝔑, where 𝔐, 𝔑 are algebraically closed fields, and
not 𝔐 ≺₁ 𝔑. Without loss of generality (because we are dealing with failure
of ∀₁-extensions) 𝔑 has finite transcendence degree over 𝔐, and so without
loss of generality 𝔑 has transcendence degree 1 over 𝔐. Let {t} be a
transcendence base for 𝔑 over 𝔐. By the assumption that not 𝔐 ≺₁ 𝔑,
there is a quantifier-free formula φ, and tuple m from 𝔐 such that
𝔑 ⊨ ∃v φ(v, m) but 𝔐 ⊨ ¬∃v φ(v, m). Robinson made the basic
observation that

(S) Diagram(𝔐(t)) ∪ Theory of algebraically closed fields ⊢ ∃v φ(v, m).


(S) follows from Steinitz’s Theorem (JACOBSON [1964]) that there is a prime
algebraically closed field containing 𝔐(t), namely 𝔑. But from (S) by the
Compactness Theorem one easily deduces that


𝔐 ⊨ ∃v φ(v, m).


See ROBINSON [1963] for a full proof.
So we see that from Steinitz we can deduce a Nullstellensatz.
To summarize this subsection, let me say that we have proved:


THEOREM 7. The theory of algebraically closed fields is model complete. 


We have outlined several approaches, and stressed the link to the
Nullstellensatz. There are other proofs. One can use Steinitz and Theorem
3. One can use Steinitz and Lindström’s Theorem (see Section 3).


2.2. Robinson then applied his method to real closed fields (ROBINSON 
[1955]). In this case there is no familiar Nullstellensatz. However, using 
Tarski’s quantifier elimination method, one can obtain: 


THEOREM 8. The theory of real closed ordered fields is model complete. 


So one can deduce a Nullstellensatz, but this seems to have no special
importance. For a proof of quantifier elimination see TARSKI [1951], or
better, COHEN [1969].

Robinson obtained model completeness by using two facts, namely: 

I. There is a prime real closed extension of any ordered field. 

II. If K is real closed, then K(x) is determined as an ordered field by the 
cut x makes in K. 

His proof then goes smoothly along the lines of his proof for algebraically 
closed fields. (At the end one must use the trivial fact that ordered fields 
are densely ordered without last element.) 

Robinson’s proof does not give quantifier elimination directly. To
supplement his treatment one would have to rely on his theorem (to be
explained in Section 3) that the model completion of a universal theory has
elimination of quantifiers.

There is a later proof due to KOCHEN [1961], by the ultraproduct method.
Essentially, one identifies the ℵ₁-saturated real closed fields and proves an
isomorphism theorem from which model completeness follows. This proof
uses the same facts Robinson used. One can also obtain quantifier
elimination thus, using an observation of SHOENFIELD [1971], or BLUM
[1968].

(It should be clear that the methods are in some sense equivalent. This is
obvious in the classical algebraic examples. But I wish to remark that there
are instructive cases (ROBINSON [1959b], ERŠOV [1967]) where Kochen’s
method seems easier than Robinson’s, and vice versa.)


2.3. Hilbert’s 17-th Problem 

Now I come to a genuine application. The problem was: Suppose f is a
positive definite rational function in x₁, ..., x_n over ℚ, or ℝ. Is f necessarily
a sum of squares of rational functions?

The problem was solved affirmatively by Artin in 1927 (ARTIN and
SCHREIER [1926], ARTIN [1927]). To do this, they invented the theory of real
closed fields, and proved the fundamental facts on existence and
uniqueness of real closure. The main idea is a generalization of Sturm’s
Theorem (JACOBSON [1964]). It seems that no existing treatment, whether
by logic or algebra, has avoided the use of Sturm’s Theorem. (The attempt
to do so has led to serious errors in both camps.) Of course, COHEN [1969]
gets by without making Sturm’s Theorem explicit (he uses the so-called
Sign Change Property instead). But there is no known way to get directly
from Cohen’s result to the theoretical facts needed for Hilbert’s 17-th
Problem. (There was a beguiling possibility of using results of
Morley-Shelah type (SACKS [1972]), on prime model extensions, to bridge
the gap. But it seems that to verify the hypotheses of such theorems one
needs Sturm’s Theorem.)

Let us therefore assume the basic facts on existence of real closure. The
logic proof will be shorter and more perspicuous than any proof disdaining
the use of logic. (It is noteworthy that Jacobson’s new algebra text
(JACOBSON [1974]) uses the logic approach to real closed fields.)

I now give Robinson’s solution of Hilbert’s 17-th Problem. Let K be an
ordered field, with a unique ordering (ℚ, ℝ are such fields). Suppose
f ∈ K(x₁, ..., x_n), and f is not a sum of squares. By a beautiful argument
having nothing to do with logic, Artin and Schreier showed that
K(x₁, ..., x_n) may be ordered so that f < 0. K is a subfield of K(x₁, ..., x_n),
so K inherits an order from K(x₁, ..., x_n). By uniqueness this must be the
original order on K. By the general theory we have a commuting diagram
of embeddings


      K ------------------> K(x₁, ..., x_n)
      |                          |
      v                          v
 Real closure ----------> Real closure of
     of K                 K(x₁, ..., x_n)


But f < 0 in K(x₁, ..., x_n), and so f < 0 in the real closure of K(x₁, ..., x_n).
So the real closure of K(x₁, ..., x_n) satisfies ∃x₁ ··· ∃x_n [f(x₁, ..., x_n) < 0].
By model completeness, the real closure of K also satisfies
∃x₁ ··· ∃x_n [f(x₁, ..., x_n) < 0]. Assume now that K is dense in the real
closure of K. Then


K ⊨ ∃x₁ ··· ∃x_n [f(x₁, ..., x_n) < 0].


So f is not positive definite.
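
The whole argument can be summarized as one chain of implications. In this sketch the tilde notation for real closure is an assumption made here for brevity; it is not the notation of the text.

```latex
% Summary sketch of Robinson's argument (needs amsmath, amssymb).
% Assumed notation: \widetilde{F} denotes the real closure of an ordered field F.
\begin{align*}
f \text{ not a sum of squares}
 &\Longrightarrow K(x_1,\dots,x_n) \text{ has an ordering with } f < 0
   && \text{(Artin--Schreier)}\\
 &\Longrightarrow \widetilde{K(x_1,\dots,x_n)} \models
   \exists x_1 \cdots \exists x_n\, [\, f(x_1,\dots,x_n) < 0 \,]
   && \text{(witnesses: } x_1,\dots,x_n\text{)}\\
 &\Longrightarrow \widetilde{K} \models
   \exists x_1 \cdots \exists x_n\, [\, f(x_1,\dots,x_n) < 0 \,]
   && \text{(Theorem 8)}\\
 &\Longrightarrow K \models
   \exists x_1 \cdots \exists x_n\, [\, f(x_1,\dots,x_n) < 0 \,]
   && \text{(} K \text{ dense in } \widetilde{K}\text{)}
\end{align*}
```

Only the third step uses logic; the first is pure algebra and the last is a continuity argument.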

We have answered the Hilbert problem affirmatively for fields K which
have a unique order and are dense in their real closure. Of course, ℚ and ℝ
are such fields. Recently, MCKENNA [1975] building on work of SCOTT
[1969] obtained the nice result that if K has a unique order, then Hilbert’s
17-th Problem has an affirmative answer for K if and only if K is dense in
its real closure.

Robinson’s work described above remains one of the very best
applications of logic to algebra. Till now, it has been surpassed only by the
work of Ax-Kochen and Eršov on p-adic fields and Artin’s Conjecture.
Clearly, Robinson’s work inspired the later achievements.


2.4. Artin’s Conjecture 

After the above work one has quite a clear project for future progress.
The delay of one decade before progress was made can be explained neatly
by a remark of BLUM [1968]. Robinson’s application was “after the
algebraic facts”; the work on p-adic fields would establish fundamentally
important algebraic results. Accordingly, it required extensive algebraic
preliminaries.

It was natural to expect that the model theory of p-adic fields would
resemble that of real fields. The most obvious similarities are

(a) both ℝ and ℚ_p are locally compact fields;

(b) for both ℝ and ℚ_p we have a viable topological-algebraic criterion
enabling us to locate zeros of polynomials, namely, for ℝ the Sign Change
Property or Sturm’s Theorem, and for ℚ_p Hensel’s Lemma (JACOBSON
[1964]) or variants.

Of course, (b) is of most importance, since this type of result for real
closed fields is the cornerstone of the whole algebraic and logical theory. So
our project should be to find p-adic analogues of ordered field and real
closed field, and to prove analogues of Facts I and II of 2.2, taking Hensel’s
Lemma as our key idea.

This is precisely what was done in the great advances made by AX and
KOCHEN [1965a, 1965b, 1966] and ERŠOV [1965] from 1964 onwards. As in
the case of Hilbert’s 17-th Problem, a remarkable feature of the method is
that to prove an algebraic result about an individual field one has to prove
model theoretic results about certain elementary classes containing this
field. I will not enter into details. The main complications are algebraic,
and model theory guides us through these complications. For the full story
one can consult the above-cited papers.

General setting. L is the natural language for valued fields. ℱ is a class of
fields, and Γ a class of ordered abelian groups. 𝔘(ℱ, Γ) is the class of
valued fields whose value group is in Γ and whose residue class field is in ℱ.
𝔘_H(ℱ, Γ) is the subclass of 𝔘(ℱ, Γ) consisting of the fields that satisfy
Hensel’s Lemma. Obviously, if ℱ, Γ are EC_Δ-classes so is 𝔘_H(ℱ, Γ).

The theorem with the deepest application is:


THEOREM 9. Suppose ℱ and Γ are EC_Δ classes, each with complete theory.
Suppose all members of ℱ have characteristic 0. Then 𝔘_H(ℱ, Γ) has
complete theory.


This is not a model completeness result, though intimately related to 
such results. The algebraic analysis goes thus: 

(a) Show that every member of 𝔘(ℱ, Γ) has a unique immediate
(KAPLANSKY [1942]) algebraic extension to a member of 𝔘_H(ℱ, Γ). This
depends on the characteristic 0 assumption. This gives us the analogue of
real closure.

(b) Establish an analogue of Fact II of 2.2. 

There are various auxiliary complications, which we ignore here. Then 
the theorem is readily proved by ultraproducts or saturated models. 

Application of Theorem 9. For prime p, let 𝔽_p((t)) be the field of formal
Laurent series over the finite field 𝔽_p. This is of course a valued field with
residue class field 𝔽_p and value group ℤ. It satisfies Hensel’s Lemma. Most
importantly, it is a C₂ field, as proved by LANG [1952].

Let D be a non-principal ultrafilter on the set P of primes. Form
∏_{p∈P} 𝔽_p((t))/D. This is a valued field, with residue class field
∏_{p∈P} 𝔽_p/D, and value group ℤ^P/D. It also satisfies Hensel’s Lemma, and
is C₂, by the Fundamental Theorem on ultraproducts. Its residue class field
has characteristic 0.

Form ∏_{p∈P} ℚ_p/D. This has the same residue class field and value group as
the above ultraproduct, and it satisfies Hensel’s Lemma. By Theorem 9,


∏_{p∈P} 𝔽_p((t))/D ≡ ∏_{p∈P} ℚ_p/D,


so ∏_{p∈P} ℚ_p/D is C₂.

In some sense this says that the average ℚ_p is C₂. Artin’s famous
conjecture said that each ℚ_p is C₂. From logic we get: For each n, d there is
a prime q(n, d) such that if p > q(n, d) and f is a homogeneous polynomial
over ℚ_p in n variables and of degree d, and if n > d², then f has a
non-trivial zero in ℚ_p.

This turned out (TERJANIAN [1966]) to be the best possible result. 
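
The passage from the ultraproduct to almost all primes can be sketched as follows. The sentence name σ_{n,d} and the exceptional set E_{n,d} are introduced here only for illustration; σ_{n,d} is meant to be a first-order sentence in the language of fields saying that every homogeneous polynomial of degree d in n variables has a non-trivial zero, for fixed n > d².

```latex
% Sketch of the transfer step (needs amsmath, amssymb).
% Assumed names: \sigma_{n,d} and E_{n,d} are not notation from the text.
\begin{align*}
E_{n,d} &:= \{\, p \in P : \mathbb{Q}_p \not\models \sigma_{n,d} \,\},\\
E_{n,d} \text{ infinite}
 &\Longrightarrow \text{there is a non-principal ultrafilter } D \text{ on } P
   \text{ with } E_{n,d} \in D\\
 &\Longrightarrow \textstyle\prod_{p \in P} \mathbb{Q}_p / D \not\models \sigma_{n,d}
   \quad\text{(Fundamental Theorem on ultraproducts)}.
\end{align*}
% This contradicts the fact that the ultraproduct is C_2 for every non-principal D,
% so E_{n,d} is finite and q(n,d) can be taken as any prime bounding E_{n,d}.
```

Note that the argument gives the existence of q(n, d) but no explicit value for it.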

We should note that COHEN [1969] has a nice proof, along the lines of
Tarski’s work on real closed fields.

ERŠOV [1965] gives various interesting extensions of these methods to the
study of C_i fields.


2.5. Model completeness of ℚ_p

We now face a slightly different situation. ℚ_p has finite residue class
field, although ℚ_p itself is of characteristic 0. To obtain (a) for Theorem 9,
one used work of KAPLANSKY [1942]. This does not apply to ℚ_p and the
fields elementarily equivalent to it.

Let ℱ be the class {𝔽_p}, and Γ the class of Z-groups (the class of groups
elementarily equivalent to ℤ (PRESBURGER [1929]); this class is model
complete if we distinguish the least positive element 1). Then ℚ_p is in
𝔘_H(ℱ, Γ). But so is 𝔽_p((t)), and this is clearly not elementarily equivalent
to ℚ_p. We distinguish the two by the sentence


v(p) = 1,

true in ℚ_p, false in 𝔽_p((t)).


Let 𝔘_H^p(ℱ, Γ) be the subclass of 𝔘_H(ℱ, Γ) satisfying the above sentence.
𝔘_H^p(ℱ, Γ) is EC_Δ whenever ℱ and Γ are.


THEOREM 10 (AX and KOCHEN [1966], ERŠOV [1965]). Let ℱ = {𝔽_p}, and let
Γ be the class of Z-groups. Then the theory of 𝔘_H^p(ℱ, Γ) is complete and
model complete.


There are several variants of the proof. I feel that the proof of ERŠOV
[1965] via Robinson’s Test is rather natural. The original proofs used the
extraneous notion of cross-section (AX and KOCHEN [1966]), and were
more complicated than need be. Cohen’s proof [1969], which is very
elegant and direct, gives quantifier elimination for ℚ_p, using the
cross-section n ↦ pⁿ.

The approach for the model-theoretic proofs is like that for Theorem 9.
One establishes analogues of (a) and (b). This time one has to invent the
algebra first (AX and KOCHEN [1965b], Lemma 8).

The quantifier-elimination problem for 𝔘_H^p(ℱ, Γ) is rather interesting. If
one brings the cross-section into the formal language then one has
elimination of quantifiers (AX and KOCHEN [1966], COHEN [1969],
WEISSPFENNING [1971]). This can be done constructively or by the
Shoenfield criterion cited in 3.2. But then it is very hard to figure out what
are the definable subsets of ℚ_p. This contrasts with the case of ℝ, where
TARSKI [1951] gave a perspicuous and useful description of the definable
sets. With this in mind, I gave (MACINTYRE [1974a]) a reformulation of the
whole theory, in terms of predicates for the valuation ring and the sets of
n-th powers, under which ℚ_p has elimination of quantifiers. We then have
the following:


THEOREM 11 (K = ℝ or ℚ_p). Any infinite definable subset of K has
non-empty interior.


One may speculate also that this approach will give us explicitly a p-adic
analogue of Sturm’s Theorem.


2.6. The analogue of Hilbert’s 17-th Problem 

Now that we have model completeness for ℚ_p, and the relevant facts
about prime model extension, we can hope to get results like those in 2.3.
Well, we need analogues of:

(i) positive definite,

(ii) sum of squares.

An analogue of (i) comes to mind, namely integral definite. Suppose
f ∈ ℚ_p(x₁, ..., x_n). f is integral definite if v(f(a₁, ..., a_n)) ≥ 0 whenever
v(a_i) ≥ 0 for 1 ≤ i ≤ n.

An analogue of (ii) is less obvious, and was found by KocHEN [1969]. He 
then modified each step of Robinson’s analysis, and obtained a remarkable 
analogue of the solution of Hilbert’s 17-th Problem. The result is too 
complex to be stated compactly, so we recommend the reader to go to 
Kochen’s paper. There is some possibility that Kochen’s analysis may 
enable us to understand the nature of the counterexamples to Artin’s 
Conjecture (TERJANIAN [1966]). 
2.7. There are many other model completeness results along the lines of
Theorem 10. The interested reader should look at Eršov’s paper (ERŠOV
[1965]), as well as ZIEGLER [1972].


2.8. There is an intriguing open problem in this area.
Problem. What is the elementary theory of 𝔽_p((t))?


The obstruction to progress is that we know no analogue of the (a), (b) of 
2.4. The only result in this direction is to be found in Coven [1972], using a 
result from Greenberg [1969]. 


2.9. Separably closed fields 

A field K is separably closed if it has no separable algebraic extensions.
ERŠOV [1967] classified the elementary types of separably closed fields, by
using Robinson’s Test.

The key algebraic result needed (as we might expect) is a Nullstellensatz.
This can be found in LANG [1958]. As far as I know, there are no
applications of Eršov’s result. Further, no other proof has been obtained
which might imply the relevant Nullstellensatz.


2.10. Finite fields 

In 1967-1968 (AX [1968]), Ax succeeded in analyzing the model theory
of finite fields, and thereby solved positively the old problem of the
decidability of finite fields. Ax’s original method of proof was by saturated
models, but as usual the main ideas are those needed for a proof in the style
of Robinson’s Test.

Ax essentially had to find axioms for ultraproducts of finite fields. The
relevant axioms for such fields K are:

(i) K is perfect,

(ii) K has exactly one extension of each degree;
and the all important

(iii) every absolutely irreducible variety over K has a point in K.
Axiom (iii) is deep. Note that of course (iii) is a sort of Nullstellensatz. On
formal grounds one can expect (iii) to imply a weak form of model
completeness.

Ax found elementary invariants for the fields (so-called pseudofinite
fields) satisfying (i)-(iii). Later his students ADLER and KIEFE [1976] found
model completeness results (indeed quantifier elimination) for such fields,
at the cost of adding some auxiliary predicates to the language.


154 MACINTYRE / MODEL COMPLETENESS [cH. A.4, §3 


Recently, JARDEN and KIEHNE [1975] simplified Ax’s treatment, and
extended his results.

Again, there seems to be till now no application of Ax’s brilliant analysis. 
In Theorem 8.10 of Chapter A.3 there is a nice result which turned up in 
the course of Ax’s work. 


2.11. Decidability 

In all the above cases, one obtained decidability results by general
nonsense once one had explicit axiomatizations. In all cases but 2.9 and
2.10 primitive recursive procedures are known (COHEN [1969]), and for
finite fields such a procedure has been announced (FRIED and SACERDOTE
[1975]). Moreover, the model-theoretic approach enables one to obtain
effective bounds in the theory of polynomial ideals. For more on this see
Chapter A.3.


3. The general theory: Model completions, existentially closed structures, 
and forcing 


3.1. Early on, Robinson discerned a general pattern in the preceding 
examples. He isolated the notion of model completion of a theory, and 
proved some satisfying general results. Since 1970, largely due to the efforts 
of Robinson, his students, and associates, the general theory has been 
much refined, making possible new sorts of applications to algebra. 


3.2. Model completion 
This is a sort of dual to the operation which assigns to a field (resp. an 
ordered field) its algebraic closure (resp. its real closure). 


DEFINITION. Let T, T* be L-theories. T* is a model completion of T iff

(a) T and T* are mutually model consistent, i.e. every model of T is
embeddable in a model of T* and vice versa;

(b) T* is model complete;

(c) If 𝔐 ⊨ T, then T* ∪ Diagram(𝔐) is complete.

A moment’s reflection shows that clause (c) is related to the uniqueness of
algebraic closure. Namely, it says that a model of T is embeddable in a
model of T* “in a unique way” (cf. Fact I of 2.2).

In time, around 1970, a weaker notion appeared. T* is a model 
companion of T if (a) and (b) hold. 


Example 1. T=theory of fields. T*=theory of algebraically closed 
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fields. That T* is a model completion of T follows from quantifier 
elimination for T* 


Example 2. T=theory of ordered fields. T*=theory of real closed 
fields. That T* is a model completion of T follows from quantifier 
elimination for real closed ordered fields. 


Example 3. T=theory of formally real fields. (No symbol for <.) 
T* = theory of real closed fields. T* is a model companion for T, but not a 
model completion (EKLOF and SABBAGH [1971]). The reason is that we do 
not have quantifier elimination for real closed fields unless we allow order 
as a primitive notion. 
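
The role of the order here can be made concrete. The following is a sketch using the standard fact that in a real closed field the non-negative elements are exactly the squares; the displayed formulas are an illustration and are not taken from the references cited above.

```latex
% In a real closed field, the order is definable in the pure field language:
\[
  x \le y \;\Longleftrightarrow\; \exists z\,(y - x = z^{2}).
\]
% The right-hand side is existential. It cannot be replaced by a
% quantifier-free formula of the field language: in one variable y, such a
% formula is a Boolean combination of polynomial equations, so it defines a
% finite or cofinite set, whereas
\[
  \{\, y : 0 \le y \,\}
\]
% is infinite with infinite complement. With < as a primitive symbol the
% formula  x < y \vee x = y  is already quantifier-free.
```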


Example 4 (Needed later). T = theory of Boolean algebras. T* = theory 
of atomless Boolean algebras. T* is a model completion of T. 


Example 3 suggests the easy: 


LEMMA 12. (a) Suppose T* is a model companion of T, where T is a 
universal theory. Then T* is a model completion of T if and only if T* has 
elimination of quantifiers. 

(b) Suppose T* is a model companion of T. Then T* is a model 
completion of T if and only if T has the amalgamation property. 


The proof is related to Shoenfield’s criterion (SHOENFIELD [1971]). 


3.3. The most important early result was: 


THEOREM 12 (ROBINSON [1963]). A theory T has at most one model 
companion. 


The proof is by an alternating chains argument. I will shortly give a 
modern version of the argument, due to SIMMONS [1975]. But first we 
should observe that a theory may have no model companion. We will see 
interesting examples later. 

I write T* for the model companion of T, when this exists. 

What are the obvious properties of the partial map T ↦ T*? 

For any T, let T_∀ (resp. T_∀∃) be the theory whose axioms are the 
universal (resp. universal-existential) consequences of T. By using the 
theorems of Łoś–Tarski (ROBINSON [1963]) and Chang–Łoś–Suszko (ROBINSON 
[1963]), which are discussed in Chapter A.2, we easily get: 
(i) If T* is defined, so is (T_∀)*, and T* = (T_∀)*. 
(ii) If (T_∀)* is defined, so is T*, and T* = (T_∀)*. 
(iii) T* is an ∀∃ theory, and T_∀∃ ⊆ T*. 
Results (i) and (ii) correspond to (a) of the definition in 3.2, and (iii) 
corresponds to (b). 
Note the obvious 


LEMMA 13. If T is model complete, then T is ∀∃. 


(This is immediate from the direct limit theorem and the 
Chang–Łoś–Suszko theorem.) 

Note also that there are complete ∀∃ theories which are not model 
complete. The most natural example I know is the theory of algebraically 
closed fields of fixed characteristic, with distinguished proper algebraically 
closed subfield (ROBINSON [1959b]). By (iii) this theory has no model 
companion. 

Now I go back to consider the map T ↦ T*. A remarkable development 
after 1969 was the following: Various constructions were found giving total 
maps T ↦ T^# which extend T ↦ T*. 


DEFINITION. An operation T ↦ T^# on theories is a companion operator if 
(i) (T^#)_∀ = T_∀; 
(ii) if T_∀ = T'_∀, then T^# = (T')^#; 
(iii) T_∀∃ ⊆ T^#. 


It turns out that there are many interesting companion operators. But: 


THEOREM 14. Suppose # is a companion operator. Suppose T has a model 
companion T*. Then T* = T^#. 


PROOF. T_∀ = (T*)_∀. Thus T^# = (T*)^# ⊇ (T*)_∀∃ = T*. So 

T* ⊆ T^#. (1) 

Note that here I don’t use model completeness of T*, only that T* is ∀∃, 
to obtain (1). 

Now, suppose M ⊨ T*. Using the mutual model consistency of T, T* and 
T^#, one gets a chain 

M = M_0 ⊆ M_1 ⊆ M_2 ⊆ ... 

with M_ω = lim M_n, where M_{2n} ⊨ T*, and M_{2n+1} ⊨ T^#. By 
Chang–Łoś–Suszko, lim M_{2n+1} ⊨ (T^#)_∀∃, so M_ω ⊨ (T^#)_∀∃. But by model 
completeness of T*, 

M = M_0 ≺ lim M_{2n} = M_ω. Thus M ⊨ (T^#)_∀∃. 

Since M was arbitrary, 

(T^#)_∀∃ ⊆ T*. (2) 

So, 

(T^#)_∀∃ ⊆ T* ⊆ T^#. 

Since T* is ∀∃, it follows that (T^#)_∀∃ = T*. 
Finally, it follows that T^# is model complete, since T* is, so (T^#)_∀∃ = T^#, 
so T* = T^#. □ 


Theorem 12 would be an immediate corollary, if we knew just one 
companion operator. I now give some examples, so we have Theorem 12. 


Examples of companion operators 


Example 1. Kaiser Hull. By (1) of the preceding proof, we see that if T' is 
∀∃ and mutually model consistent with T, then T' ⊆ T^#. This suggests the 
existence of a minimal companion operator, perhaps the maximal ∀∃ 
theory mutually model consistent with T. Why does such exist? T_∀∃ is 
mutually model consistent with T, and if T_1, T_2 are ∀∃ theories mutually 
model consistent with T, then so is T_1 ∪ T_2. This 
is proved by the ubiquitous alternating chains argument (KAISER [1969]). 
So we get T°, the Kaiser Hull of T, and T ↦ T° is obviously a companion 
operator. For a while it was the only one known. Notice how natural it is. 
An ∀∃ sentence tells us, typically, conditions under which a system has a 
solution, surely the sort of thing one would want as an axiom for closed 
structures. 
This brings us to existentially closed structures and: 


Example 2. The operator T ↦ T°. 
DEFINITION. Suppose M ⊨ T_∀. M is T-existentially closed (T-e.c.) if M ⊆ 
M_1 ⊨ T_∀ implies M ≺_1 M_1. 


Universal algebraic arguments (essentially union of chains) give us: 


THEOREM 15. Suppose M ⊨ T_∀. Then there is an M_1 such that M_1 is T-e.c. 
and M ⊆ M_1. Moreover, M_1 can be chosen to have cardinal 
max(card(M), card(L)). 


Notation. E_T = class of T-e.c. structures. 


I recommend SIMMONS [1972a] for the basic material on E_T. There is for 
example the following implicit definition of E_T: 

E_T is the unique class 𝒞 of models of T_∀ such that 

(i) every model of T_∀ is embeddable in a member of 𝒞; 

(ii) if M, N ∈ 𝒞 and M ⊆ N then M ≺_1 N; 

(iii) if N ∈ 𝒞 and M ≺_1 N then M ∈ 𝒞. 

So the class E_T is mutually model consistent with the class of models of T, 
and E_T satisfies the conditions of Robinson’s Test. Thus Th(E_T) is a good 
candidate for a model companion of T. The snag is that E_T may not be an 
EC_Δ. 

Now we can define a new companion operator. 


DEFINITION. T° = Th(E_T). 

THEOREM 16. T ↦ T° is a companion operator. 
This is a trivial verification. 

COROLLARY. If T* exists, then T* = T°. 
But we can say more. 


THEOREM 17 (EKLOF and SABBAGH [1971]). T has a model companion if 
and only if E_T is EC_Δ. 


PROOF. (⇐) E_T satisfies conditions of Robinson’s Test, so if it is 
elementary then Th(E_T) is model complete. 

(⇒) If T* exists, T* = T°, so T° is model complete. If M ⊨ T°, M ⊆ M_1 
for some M_1 ∈ E_T. So M ≺ M_1, since T* is model complete. By (iii) of the 
implicit definition, M ∈ E_T. So E_T = Mod(T*), and E_T is EC_Δ. 

This theorem gives us a workable criterion for showing that certain 
theories have no model companion. Here is a typical application, which 
suggested a lot of later research. 


Example (EKLOF and SABBAGH [1971]). T = group theory. Let M_n (n ∈ ω) 
be in E_T. It is trivial that there exist in M_n elements x_n, y_n of orders 
n+1, n+2 respectively. Let D be a non-principal ultrafilter on ω, and let 
M = ∏M_n/D. If E_T is EC_Δ, then M ∈ E_T. But this gives a contradiction, 
as follows. Let x = ∏x_n/D, y = ∏y_n/D, with the obvious notation. By the 
Fundamental Theorem on ultraproducts, x and y have infinite order. So by 
a basic combinatorial result (HIGMAN, NEUMANN and NEUMANN [1949]) 
there exists a group extending M, with an element t such that t⁻¹xt = y. 
Since M ∈ E_T, there must be such an element t in M. But then {n ∈ ω : x_n 
and y_n are conjugate in M_n} ∈ D. But this set is empty, since x_n and y_n 
have different orders. 
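
The last step of the example rests on the elementary fact that conjugate elements of a group have the same order. The sketch below checks this by brute force in the symmetric group S_3; the helper names (`compose`, `inverse`, `order`, `are_conjugate`) are chosen for this illustration and do not come from the cited papers.

```python
from itertools import permutations
from math import gcd


def compose(p, q):
    """Return the permutation p∘q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(p)))


def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def order(p):
    """Order of p, computed as the lcm of its cycle lengths."""
    seen, result = set(), 1
    for start in range(len(p)):
        length, j = 0, start
        while j not in seen:          # walk the cycle containing `start`
            seen.add(j)
            j = p[j]
            length += 1
        if length:                    # length is 0 if the cycle was already visited
            result = result * length // gcd(result, length)
    return result


def are_conjugate(x, y, group):
    """True if some t in `group` satisfies t^-1 x t = y."""
    return any(compose(inverse(t), compose(x, t)) == y for t in group)


S3 = list(permutations(range(3)))
x = (1, 0, 2)   # a transposition
y = (1, 2, 0)   # a 3-cycle
print(order(x), order(y), are_conjugate(x, y, S3))
```

In the example above the same fact is applied in each M_n, where x_n and y_n have orders n+1 and n+2, so no element t_n can conjugate one to the other.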


Note. An almost identical argument works for skew fields (SABBAGH 
[1970]). Related arguments work for nilpotent groups (SARACINO [1974b]), 
solvable groups (SARACINO [1974a]), Lie algebras (MACINTYRE [1974b]), 
commutative rings (CHERLIN [1973]), modules over non-coherent rings 
(EKLOF and SABBAGH [1971]). 

Thus for these classes we cannot easily understand the existentially 
closed structures. 


3.4. The argument for Theorem 17 really shows that if T* exists, then E_T is 
EC_Δ and T* = Th(E_T). So we can identify E_T for such classes as fields, and 
ordered fields, by the results of Section 2. In this light we see T* as a 
metamathematical analogue of algebraic closure. Later, however, in the 
case of differential fields, we shall see that in analyzing the models of T* we 
may not get too much help from thinking of the transparent example of 
algebraically closed fields. The prime model extension phenomena of 
Section 2 do not drop out from model completeness. It was at this point 
that Morley’s analysis (MORLEY [1965]) of “algebraic” became relevant to 
algebra. 


3.5. The forcing constructions 

In 1969 there was an unexpected development. Since 1963, people had 
tried, with little success, to modify Cohen forcing (COHEN [1966]) into a 
general model theoretic construction. In Reyes’ thesis (REYES [1967]) there 
is a Baire category approach linking forcing with homogeneous universal 
models. Reyes essentially obtained the notion of infinite forcing, which 
ROBINSON [1971] would develop in a more formal way in 1969-1970. But 
first Robinson invented finite forcing (BARWISE and ROBINSON [1970]), also 
known as model-theoretic forcing. The foundations of the subject are in 
Keisler’s contribution to this book. 
I want to say a little about forcing as a method of construction, and to 
explain the basic properties of the forcing companions. 

T is given. There is a notion of forcing relative to T. The conditions are 
essentially finite fragments of diagrams of models of T. To construct a 
model by finite forcing one typically constructs larger and larger finite 
fragments of its diagram. The method is somehow like that involved in 
constructing recursively enumerable sets. If this remark is taken seriously, 
one can get applications by mixing combinatorial algebra and recursion 
theory (MACINTYRE [1972a]). One of the effects of Robinson’s development 
was to bring together classical model theory and recursion theory 
(MACINTYRE [1972a]). 
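
For orientation, here is a sketch of how the forcing relation between a condition p and a sentence of L with added constants is usually defined. It assumes the common presentation in which conditions are finite sets of basic sentences consistent with T and the connectives ¬, ∨, ∃ are primitive; the formulation in the references may differ in detail.

```latex
\begin{align*}
p \Vdash \varphi
  &\iff \varphi \in p
  && (\varphi \text{ atomic}),\\
p \Vdash \varphi \vee \psi
  &\iff p \Vdash \varphi \ \text{ or } \ p \Vdash \psi,\\
p \Vdash \exists v\,\varphi(v)
  &\iff p \Vdash \varphi(c) \ \text{ for some constant } c,\\
p \Vdash \neg\varphi
  &\iff \text{there is no condition } q \supseteq p \text{ with } q \Vdash \varphi.
\end{align*}
```

Only the negation clause looks beyond p itself, at all conditions extending p.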

Once we have set up finite forcing relative to T, we define T^f, the finite 
forcing companion of T, by 


T^f = {φ | ∅ ⊩ ¬¬φ}. 


It is routine to prove that T^f is a consistent theory. The following theorem 
holds even for uncountable logics L. The proof is in BARWISE and ROBINSON 
[1970]. 


THEOREM 18. (a) T ↦ T^f is a companion operator. 
(b) T^f is complete if and only if T has the Joint Embedding Property. 


Then one gives (BARWISE and ROBINSON [1970]) a general definition of 
F_T, the class of (finite) generic models. In general, if M ∈ F_T, M ⊨ T^f. But if 
L is uncountable, F_T may be empty (HENRARD [1971], SHELAH [1972a]). 

The general facts are: 


THEOREM 19. (a) If L is countable, F_T ≠ ∅ and T^f = Th(F_T); 

(b) F_T ⊆ E_T; 

(c) N ∈ F_T, M ≺ N implies M ∈ F_T; 

(d) M ∈ F_T if and only if M completes T^f, i.e. iff M ⊆ N ⊨ T^f implies 
M ≺ N; 

(e) T* exists if and only if F_T is EC_Δ. 


The proofs (BARWISE and ROBINSON [1970]) do not involve any complications. 
We now have: 


Ferro, 


Each inclusion is proper when T is group theory (MACINTYRE [1972]). 
Notes. (1) Later, ways were found of constructing T^f and F_T without 
forcing (SIMMONS [1973b], HENRARD [1973]). Also, one can implicitly define 
F_T, along the lines of the definition for E_T (SIMMONS [1972b]). 

(2) There is an important connection between the above forcing and the 
Omitting Types Theorem. The two methods are in some sense the same 
(KEISLER [1973], SIMMONS [1973a]). One should also look at SHELAH [1972]. 


3.6. Infinite forcing 

This is a blunter instrument than finite forcing, as far as constructing 
models is concerned. But it has nice properties, which can be rapidly 
developed. 

Again we fix T. This time we define a relation 


M ⊩ φ(m) (M forces φ) 


where M ⊨ T_∀, φ is an L-formula, and m is a tuple from M. 
The definition is the obvious one, with the all important recursion clause 


M ⊩ ¬φ iff there is no N with M ⊆ N ⊨ T_∀ and N ⊩ φ. 


There is an obvious definition of generic model. M is generic if forcing and 
satisfaction coincide on M. Let G_T be the class of generics, and let 
T^F = Th(G_T). 

One has the following implicit definition of G_T: 


THEOREM 20 (ROBINSON [1971]). G_T is the unique class 𝒞 of models of T_∀ 
such that 
(i) every model of T_∀ is embeddable in a member of 𝒞; 
(ii) if M, N ∈ 𝒞 and M ⊆ N, then M ≺ N; 
(iii) if N ∈ 𝒞 and M ≺ N, then M ∈ 𝒞. 


This again makes it clear that T^F is a candidate for T*. 


THEOREM 21. (a) T ↦ T^F is a companion operator; 
(b) G_T ⊆ E_T; 
(c) T^F is complete iff T has the Joint Embedding Property; 
(d) T* exists iff G_T is EC_Δ. 


There is an interesting connection with homogeneous universal models. 
One sees that the latter are existentially closed in a very strong sense — 
certain infinite systems of equations and inequations may be allowed. 

This leads to: 
DEFINITION. Suppose M ⊨ T_∀. M is T-existentially universal (T-e.u.) if 
whenever M ⊆ N ⊨ T_∀ and Σ(v) is a set of basic formulas over M, with 
finitely many free variables v, and card(Σ) < card(M), then 


N ⊨ ∃v ⋀Σ(v) implies M ⊨ ∃v ⋀Σ(v). 


Homogeneous universal models of T_∀ are clearly T-e.u. 
The following nice theorem shows how to get G_T without forcing: 


THEOREM 22 (FISHER [1970], WOOD [1972]). M ∈ G_T if and only if there is an 
N such that N is T-e.u. and M ≺ N. 


I recommend WOOD [1972] for an elegant account of the preceding, and 
for extensions to infinitary logic. 


Notes. (1) MANEVITZ [1975] showed that G_T is not absolute for models of 
ZFC containing T. Both F_T and E_T are absolute. 

(2) Suppose T is countable. Then (see MACINTYRE [1972b]) E_T and F_T 
are axiomatizable by single sentences of L_{ω_1,ω}. G_T is axiomatizable by a set 
of sentences of L_{ω_1,ω} (WOOD [1972]), but not in general by a single sentence 
(MACINTYRE [1975a]). 


3.7. Connecting T^f and T^F 
We have now 
oT 
a ets 
All inclusions are, in general, proper (MACINTYRE [1972]), and T^f ∩ T^F may 
be bigger than T°. Note that T^f ∩ T^F is also a companion operator, but 
does not satisfy the Joint Embedding Condition that T^f and T^F do. Finally, 
T^f ∪ T^F can be inconsistent (MACINTYRE [1972a]). 


3.8. In some unpublished work from 1971-1972, I looked at other notions 
of T-forcing and forcing companions (MACINTYRE [1972c]). One can force 
with recursive conditions, etc. I found this useful in group theory. 


3.9. Connections between categoricity and model completeness 

There are easy examples to show that a complete κ-categorical theory T 
need not be model complete. But there is a nice theorem of Lindström 
linking the two. Recall that a model complete theory T must be ∀∃. 


THEOREM 23 (LINDSTRÖM [1964]). Suppose T is complete ∀∃ and 
κ-categorical, where κ ≥ card(L). Then T is model complete. 


PROOF. Suppose not. Then by Löwenheim–Skolem there are models 
M ⊆ N of T of power κ with M ⊀_1 N. But M, N ∈ E_T, by categoricity. This 
is a contradiction. □ 


3.10. Saracino found a slightly surprising connection between 
ℵ_0-categoricity and model completeness. 


THEOREM 24 (SARACINO [1973]). Suppose L is countable, and T is complete 
ℵ_0-categorical. Then T has a model companion T*, and T* is 
ℵ_0-categorical. 


PROOF. We have to show that E_T is EC_Δ. Let ψ(v) be a universal formula 
of L. Let e(ψ) be the set of existential formulas φ(v) such that 
T ⊨ ∀v (φ(v) → ψ(v)). This formula is logically equivalent to an ∀-formula, 
so is a consequence of T_∀. Now I quote the infinitary axioms for E_T, 
namely: 


T_∀ ∪ {∀v (ψ(v) → ⋁{φ(v) : φ ∈ e(ψ)})} 


See SIMMONS [1972a]. But, given ψ, there are finitely many existential 
φ_1, ..., φ_n such that for any φ ∈ e(ψ), there is an i, 1 ≤ i ≤ n, such that 
T ⊨ ∀v (φ(v) ↔ φ_i(v)), by the familiar characterization of ℵ_0-categoricity 
due to Ryll-Nardzewski (see CHANG and KEISLER [1973]). But 
T ⊨ ∀v (φ ↔ φ_i) iff T_∀∃ ⊨ ∀v (φ ↔ φ_i). So, axioms for E_T are 


T_∀∃ ∪ {∀v (ψ(v) → φ_1(v) ∨ ... ∨ φ_n(v))}. 


So T* = T°. I have now to show that T* is ℵ_0-categorical. We have to show 
T* has < ω n-types for each n. T* is model-complete, so we have to show 
T* has only finitely many inequivalent existential formulas. But T and T* 
have the same consistent ∃-formulas, and T_∀∃ ⊆ T*, so the result is 
obvious. □ 


Note. I have used only the following property of T: that T has only 
finitely many existential n-formulas. It is not clear to me that this is 
equivalent to ℵ_0-categoricity. 


Example. T = dense linear order with end-points. T* = dense linear order 
without end-points. 
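
To see why the end-points and any gaps must disappear in the existentially closed structures, one can look at a finite linear order sitting inside the rationals. The sketch below is only an illustration (the function name `unsolved_conditions` is made up for it): it lists existential conditions with parameters from the finite order that have a solution in the extension ℚ but none in the order itself.

```python
from fractions import Fraction


def unsolved_conditions(points):
    """List existential conditions over a finite linear order that it fails to solve.

    `points` is a finite nonempty collection of rationals, ordered as a
    suborder of Q. Each entry is (description, witness): the witness is a
    rational solving the condition in the extension Q, while no element of
    `points` solves it.
    """
    pts = sorted(Fraction(p) for p in set(points))
    found = [
        (f"exists x (x < {pts[0]})", pts[0] - 1),     # there is a least element
        (f"exists x ({pts[-1]} < x)", pts[-1] + 1),   # there is a greatest element
    ]
    for a, b in zip(pts, pts[1:]):                    # consecutive points: density fails
        found.append((f"exists x ({a} < x < {b})", (a + b) / 2))
    return found


for description, witness in unsolved_conditions([0, 1]):
    print(description, "is solved in Q by", witness)
```

Each condition found corresponds to one of the ∀∃ axioms of dense linear order without end-points (no least element, no greatest element, density) failing in the small structure.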


Example. Certain Boolean extensions of finite models. See Section 6. 
It is reasonable in virtue of the theorems of Lindström and Saracino to 
hope that if T is ℵ_1-categorical, then T has an ℵ_1-categorical model 
companion. Unhappily, this hope was dashed: 

(i) T may not have a model companion (BELEGRADEK and ZILBER [1974], 
SARACINO [1975]). 

(ii) If T has a model companion T*, T* may not be ℵ_1-categorical 
(BELEGRADEK and ZILBER [1974]). 

One should observe however that if T is κ-stable and has a model 
companion then the model companion is κ-stable too. 


3.11. Skolemization 

I wish to point out a recent, and surprising, theoretical advance by 
WINKLER [1975]. He shows that for many natural T (e.g. algebraically 
closed fields) the “free Skolemization” of T has a model companion. His 
proof involves the concept of algebraic boundedness arising from the study 
of ℵ_1-categoricity (BALDWIN and LACHLAN [1971]). 


4. Applications: Differentially closed fields and prime model extensions 


4.1. The aim of this short section is mainly philosophical. I will discuss an 
interaction between logic and algebra which has a different nature from the 
“positive” applications of Section 2 and the “negative” applications to 
come in Section 5. 

In this example Robinson’s ideas have to be supplemented by ideas from 
stability theory (MORLEY [1965], SHELAH [1972b]), and then logic 
contributes to the development of a branch of algebra. 


4.2. Ritt (see RITT [1966]) introduced the notion of differential field. This is 
naturally formalized in the language of field theory augmented by a 
function symbol for the derivation. See SACKS [1972]. SEIDENBERG [1956] 
found a criterion which tells when a system Σ over a differential field F has 
a solution in a (differential) extension of F. More precisely, he produced an 
algorithm which to a finite system Σ(v, m) assigned a ∀_1-condition Σ'(m) 
such that 


F ⊨ Σ'(m) iff for some extension F_1 of F, F_1 ⊨ ∃v ⋀Σ(v, m). 


So Σ is unsolvable in all extensions of F if and only if F ⊨ ¬Σ'. ¬Σ' is 
equivalent to an existential formula. 
In characteristic 0, Σ' can be chosen quantifier-free. In positive 
characteristic, Seidenberg’s analysis is much more complicated, and it takes a bit 
of work to reformulate his original statement in the way given above. 

At any rate, the fact that Σ' can be chosen ∀_1 easily gives us axioms for 
E_T where T is the theory of differential fields of some specified 
characteristic. So: 


THEOREM 25 (ROBINSON [1959] for p = 0; WOOD [1973] for p ≠ 0). The 
theory of differential fields of characteristic p has a model companion. 


(The model companion is denoted by DCF_p.) Note that the axioms for 
T* in this case were not too memorable. 
For p = 0, T* has elimination of quantifiers, so: 


THEOREM 26 (ROBINSON [1959]). The theory of differential fields of charac- 
teristic 0 has a model completion. 


Theorem 26 fails in characteristic p ≠ 0, because of failure of the 
amalgamation property. See WOOD [1973]. 

It turned out however that in characteristic p ≠ 0 we may adjoin to our 
language a function symbol extracting p-th roots (and giving 0 when 
elements are not p-th powers), and then find a natural theory of differential 
fields with a model completion. We consider the theory of perfect 
differential fields of characteristic p. To get this we add to the theory of 
differential fields of characteristic p the axiom 


∀x (D(x) = 0 implies ∃y (y^p = x)). 


WOOD [1973] showed that this theory has a model completion. The p-th 
root symbol gives quantifier elimination, because it makes the axioms 
universal. In the language of differential fields, the model completion is just 
the earlier DCF_p. In the extended language, the model completion is also 
called DCF_p. 
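
The axiom for perfect differential fields is the converse of a computation that holds in every differential field of characteristic p. The following short derivation is an illustration using only the Leibniz rule for the derivation D; it is not quoted from WOOD [1973].

```latex
% Characteristic p > 0, with D additive and satisfying the Leibniz rule.
\[
  D(y^{p}) \;=\; p\,y^{p-1}\,D(y) \;=\; 0 ,
\]
% so every p-th power is a constant, i.e. has derivative 0.
% The added axiom asserts the converse: every constant is a p-th power,
\[
  \forall x\,\bigl(D(x) = 0 \;\rightarrow\; \exists y\,(y^{p} = x)\bigr).
\]
```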


4.3. There were several unusual features to this example. 

Firstly, the axioms for the model completions were nowhere near as 
perspicuous as in the case of fields or ordered fields. 

The second point was more perplexing. Think back to Section 2. In these 
examples one had not merely the operation of model completion, but also 
an operation of closure on structures. For example, a field K has an 
algebraic closure K̄ which is prime over it in the following sense: Any 
embedding K → L, where L is algebraically closed, can be decomposed as 
K → K̄ → L. 
Moreover, any two such prime extensions of K are isomorphic over K. 

Finally, the prime extension is minimal. That is, the embedding K → K̄ 
cannot be properly factorized as K → L → K̄, where L is algebraically 
closed. 

There are analogous facts for the various examples in Section 2, and 
these facts are certainly of much interest to a model theorist. 

The natural question is: 

Closure Problem. Are there always such facts associated with the model 
completion of a universal theory? 

For example, is there a good notion of differential closure of a 
differential field? 

No such notion was known classically. So here we have a case where the 
metamathematical closure is known, but no operation of closure on the 
relevant structures. 

East coast model theory was not able to clarify this matter further. 
Progress was made by combining the eastern and the western methods. 
BLUM [1968] showed that the concepts of MORLEY [1965] enable one to 
obtain a differential closure in characteristic 0. 


4.4. In the examples of Section 2, the closure of K is algebraic over K, in 
the classical field-theoretic sense. This is not so in the differential case, 
simply because there are differential equations whose solutions are 
transcendental over ℚ. 

MORLEY [1965] provided a beautiful notion of algebraic in a general 
setting. In the formulation of BLUM [1968] there are notions α-algebraic 
for each ordinal α. The case α = 0 is the familiar classical notion for fields. 
Blum showed that we have to use n-algebraic for all n < ω to construct the 
differential closure. 

Her main results are: 


THEOREM 27. (a) Without using Seidenberg’s result, and using instead an 
analysis of simple extensions, one can find the model completion DCF_0 and 
with a much nicer set of axioms. 

(b) DCF_0 is ω-stable, and so every differential field of characteristic 0 has 
a prime model extension to a model of DCF_0. 

(c) Seidenberg’s result can be derived from the model theoretic analysis. 


(It should be pointed out that there is a residue of Seidenberg’s analysis 
in Blum’s work, in the form of certain reduction procedures.) 
So now we have the notion of a differential closure, in characteristic 0. 

Later, WOOD [1974] carried out a corresponding analysis for perfect 
differential fields of characteristic p. The analogues of (a) and (c) hold; (b) 
has to be modified, since the theory of perfect differential fields is not 
ω-stable. Wood showed that nevertheless it has the related property that 
the isolated points are dense, and so by MORLEY [1965] one has the prime 
model extension property of (b). So one has also a differential closure in 
characteristic p. 

The question of uniqueness of differential closure is trickier. Actually the 
differential closure is unique, but this needed an advance by SHELAH 
[1972b] in the general theory. Shelah showed that for ω-stable T the prime 
model extension is unique. Later, in SHELAH [1975], he showed that for 
stable T the prime model extension is unique if it exists. SHELAH [1974] 
showed that DCF_p is stable. 

So, putting together the material of the preceding paragraphs we get: 


THEOREM 28. The differential closure is unique in all characteristics. 


Minimality. Unfortunately, the differential closure is not minimal, at 
least in characteristic 0. This was proved independently by KOLCHIN [1974] 
and SHELAH [1974]. One should consult SHELAH [1972b] for the 
Morley–Shelah theory underlying this. 

The characteristic p case is unresolved. 


Notes. (1) In SACKS [1972] there is a Nullstellensatz which follows from the 
basic model completeness result. 
(2) Till now there is no application with the impact of those in Section 2. 


5. Negative applications: Groups, skew fields, and number theories 


5.1. The new methods of Section 3 give new constructions for existentially 
closed structures. The main direction of the new “applications” was to 
show that E_T is an extremely complicated class, for certain natural T. 
Further, one could provide counterexamples to existence of closures 
and/or minimal closures. A very comprehensive account of this 
development has been published in HIRSCHFELD and WHEELER [1975]. We begin 
with a brief summary of the method, and then pass to more concrete 
algebraic problems. 


5.2. The non-model-theoretic background needed is: 
(a) T = group theory: Combinatorial group theory as in HIGMAN, 
NEUMANN and NEUMANN [1949]; 

(b) T = theory of skew fields: Cohn’s work in COHN [1971b]; 

(c) T = a number theory: the basic theory of r.e. sets, some ideas going 
back to RABIN [1961], and the theorem of MATIJASEVIČ [1970]. 

The fundamental fact established in all three cases is: 

There is a uniform method assigning to each M in E_T an interpretation 
P(M) such that 

(a) P(M) is a substructure of 2-nd order arithmetic; 

(b) the interpretation persists under extension; 

(c) if M is T-existentially universal, P(M) is the standard model of 2-nd 
order arithmetic. 

We can readily conclude that 2-nd order arithmetic is Turing reducible to 
T^F. The converse is a general fact for arithmetical T such as the examples 
given at the beginning of this section. So: 


THEOREM 29 (HIRSCHFELD and WHEELER [1975], MACINTYRE [1971]). For the 
T given above, T^F is not an analytic set. 


Along the same lines one shows: 
THEOREM 30. For the T given above, T^f is Δ^1_1 but not arithmetic. 


COROLLARY. T^f ≠ T^F. 


The proofs of these fundamental results involve much coding and 
definability theory. One should consult HIRSCHFELD and WHEELER [1975]. 


5.2. Existentially closed groups 

A shortcoming of the work reported in 5.1 is that it does not give results 
which are intelligible to an algebraist unfamiliar with recursion theory. In 
this subsection I will describe some natural algebraic results proved by 
forcing. 

In this subsection, T is the theory of groups, formalized in the usual logic 
with ·, ⁻¹, e. Since T is a universal theory, the members of E_T are groups 
and are in fact precisely the non-trivial algebraically closed groups 
considered by W. R. Scott in the early 1950’s (SCOTT [1951]). Scott obtained 
essentially a special case of Theorem 15. 

Later, NEUMANN [1952] showed that e.c. groups are simple. Also, a trivial 
counting argument gives 2^{ℵ_0} isomorphism-types of countable e.c. groups. 
But it emerged that we knew no invariants for e.c. groups, i.e. we could not 
see how to tell apart, informatively, two countable e.c. groups. 
Confirmation of the difficulty came with the beautiful theorem of NEUMANN [1973]: 


THEOREM 31. Any finitely generated group which can be presented with 
solvable word-problem is embeddable in all e.c. groups. 


This gave a focus to efforts using finite forcing. Neumann had used 
Higman’s Embedding Theorem (HIGMAN [1961]), and this would soon be 
used in combination with finite forcing (MACINTYRE [1972a]). Also, 
Neumann’s result raised the question: Which finitely generated groups are 
embeddable in all existentially closed groups? 

This turned out to have a nice answer, and an easy demonstration using 
finite forcing. 


THEOREM 32 (MACINTYRE [1972a]). If a finitely generated group does not 
have solvable word-problem, then there is an e.c. group in which it is not 
embeddable. 


(This is an instance of a general theorem applying to skew fields, Lie 
algebras, etc.) 

So, for the first time, one had a purely algebraic criterion for when a 
finitely generated group has solvable word-problem. (Later, BOONE and 
HIGMAN [1974] gave a quite different criterion.) 

But the most important problem was to construct countable e.c. groups 
that genuinely “look different”. An observation of MACINTYRE [1972a] and 
FISHER [unpublished] gives: 


THEOREM 33. Two countable e.c. groups are isomorphic if and only if they 
have the same isomorphism-types of finitely generated subgroups. 


This suggests that the following problem might have a positive answer: 


Bers’ Problem: Are any two e.c. groups elementarily equivalent? 


It turned out (MACINTYRE [1972a]) that the answer is negative. T° is not a 
complete theory. I constructed a ∀_4 sentence φ (to be explained below), 
which has clear algebraic meaning in members of E_T, such that φ holds in 
some members of E_T and fails in others. Later, BELEGRADEK [1974a] and 
MILLER [1974] obtained an ∀_3-sentence. On general grounds it is known 
that T° is complete for ∀_2 sentences. 
The new phenomenon first observed here is the following: There are 
classes 𝒞 of groups such that 𝒞 is not EC_Δ, but 𝒞 ∩ E_T is of the form 
𝒞_1 ∩ E_T, where 𝒞_1 is EC (axiomatizable by a single sentence). In other 
words, there are concepts which are not first order for the class of all 
groups, but are first order relative to the class of e.c. groups. 

A trivial example is: 


𝒞 = class of simple groups. 


More complex examples are: 

(1) 𝒞 = class of groups which have a 2-generator subgroup into which all 
other 2-generator subgroups are embeddable (MACINTYRE [1972]). 

(2) For any finitely presented group H, the class 𝒞 of groups in which H 
is embeddable (BELEGRADEK [1974], MILLER [1974]). 

Both proofs use combinatorial group theory and coding devices. 

Bers’ Problem was solved by showing that the class 𝒞 of Example 1 gives 
a non-trivial partition of E_T. To show this, I used Higman’s universal 
finitely presented group (HIGMAN [1961]) and a finite forcing argument. 
This gives 𝒞 ∩ E_T ≠ ∅. To get E_T ⊄ 𝒞, one observes that no existentially 
universal group is in 𝒞, using a counting argument. 

Let Max be the sentence describing 𝒞 ∩ E_T. Max is ∀_4. Later 
(unpublished), using forcing with conditions recursively enumerable in a certain 
set, I showed that every countable member of E_T is embeddable in a 
member of 𝒞. Also, BELEGRADEK [1974b], MACINTYRE [1971], MILLER 
[1974] and WHEELER [1972] independently proved that there are 2^{ℵ_0} 
elementary types of e.c. groups, and I proved that there are 2^{ℵ_0} elementary 
types satisfying Max. These results, being essentially codings of results 
about recursively enumerable sets, give much less algebraic information 
than the original solution to Bers’ Problem. 

Here are some other results proved by forcing (MACINTYRE [1972a]). (c) 
and (d) are in strong contrast to results of Sections 2 and 4. 

(a) G_T ∩ F_T = ∅; 

(b) every countable e.c. group has a proper L_{∞ω} extension; 

(c) every countable e.c. group contains a proper copy of itself (so there 
is no minimal member of E_T); 

(d) there is no e.c. group embeddable in all others (so there is no prime 
model of E_T). 

These results used in varying degrees material on the word-problem for 
groups. As might have been expected, such combinatorial techniques mix 
nicely with finite forcing. 

Later results. BELEGRADEK [1974a, b] and MILLER [1974] obtained good 
new results on the problem of when one can put one given group in as a 
subgroup of an e.c. group, and leave out another given one. Very recently 
ZIEGLER [1975] improved (b) to get extensions of each infinite cardinal, 
superseding MACINTYRE [1973b]. In particular, F_T has members of all 
infinite cardinalities. 


5.3. Skew fields 

The study of e.c. skew fields began in 1970, once COHN [1971b], in a
brilliant advance, made available suitable analogues of the combinatorial
techniques in group theory. SABBAGH [unpublished] quickly showed that
there is no model companion for the theory of skew fields. Later work was
done by BOFFA and VAN PRAAG [1972], MACINTYRE [1971] and WHEELER
[1972].

For this subsection, let T be the theory of skew fields. 

In MACINTYRE [1971] I showed that T^F is not analytic and T^f is not
arithmetic. Also, there are 2^{ℵ_0} elementary types of members of E_T. The
proofs used finite forcing, and did not have much explicit algebraic content, 
being codings of phenomena in recursion theory. Independently, WHEELER 
[1972] found much nicer proofs. We both did the analogue of (a) from 5.2, 
and Wheeler did (b) and (c). To get (d), I had first to prove (MACINTYRE 
[1973c]) the unsolvability of the word-problem for skew fields. 

There are some major problems open. There is an analogue of Max (cf. 
5.2) for skew fields, but it is unknown if there are members of E_T satisfying
Max. The main problem is the absence of an embedding theorem of 
Higman type (cf. 5.2). To get such a theorem seems rather hard. Anyone 
interested should look at MACINTYRE [1975p]. So at present there is no 
natural way of telling e.c. skew fields apart. Another problem is to extend 
the final results of 5.2 to skew fields. 

Nullstellensatz. It is rather disappointing that the structure of e.c. skew 
fields is so complex. For one might have hoped they would be a useful tool 
in ring theory, and a setting for some future non-commutative algebraic 
geometry. 

What about a Nullstellensatz? Let K be a skew field, and let
K⟨x_1, ..., x_n⟩ be the ring of polynomials over K in non-commuting
variables x_1, ..., x_n, not commuting with K. Suppose now K is e.c. and I is
an ideal in K⟨x_1, ..., x_n⟩ with a zero in some extension skew field. Does I
have a zero in K?

It obviously does, if I is finitely generated. But WHEELER [1972], on the 
basis of the analogue of 5.2 (b), (c), gave a counterexample to the general 
case. 
For existentially universal skew fields the Nullstellensatz does not fare
quite so badly, and there is still some hope for an algebraic geometry. 


5.4. Number theories 

Let T be full number theory. In BARWISE and ROBINSON [1970], it is
proved that F_T = {ℕ}, and in fact Mod(T) ∩ E_T = {ℕ}. In ROBINSON [1973]
he showed that one can define ℕ in any member of E_T (cf. 5.2). His proof
uses a technique pioneered by RABIN [1961], with simple sets used instead
of creative sets.

Later the whole area was investigated systematically by GOLDREI,
MACINTYRE and SIMMONS [1973] and HIRSCHFELD [1972]. See also HIRSCHFELD
and WHEELER [1975]. The general flavor of results is that the situation is as
chaotic as for groups and skew fields. The proofs use the same mixture of
model theory and theory of recursively enumerable sets.


5.5. Other systems 

There is reason to believe that Lie algebras, nilpotent groups of specified
class, and solvable groups of specified class, will have E_T as complex as the
above. See MACINTYRE [1974b] and SARACINO [1974a, 1974b], and
MACINTYRE and SARACINO [1974].

For commutative rings, the situation is more obscure. There is no 
model-companion (CHERLIN [1973]), but no evidence of combinatorial 
complexity has been found (despite TAITSLIN [1974]).


6. Sheaves and model completeness 


6.1. I want to end with a subject where there are positive results, 
connections with Section 2, and some unexplored research paths. The 
discussion will be brief, since I wish mainly to communicate some formal 
ideas. 

The starting point is a theorem of CARSON [1973] and LIPSHITZ and
SARACINO [1973]. They proved that the theory of commutative rings with 1 
and no non-zero nilpotent elements has a model companion. (Contrast this 
with CHERLIN [1972], cited in 5.) In fact, they gave nice axioms for 
existentially closed commutative regular rings with 1. It is easy to see the 
mutual model-consistency of the theory of commutative rings with 1 and no 
non-zero nilpotents, and the theory of commutative regular rings with 1. It 
is more convenient in what follows to work with T, the theory of 
commutative regular rings with 1. 

In order to reach a general viewpoint, Carson’s sheaf-theoretic analysis 
seems to me more natural. By general representation theory (PIERCE
[1967]) any model of T is isomorphic to the ring of global sections of a 
sheaf F of fields, over a Boolean space X. With the Boolean space one has 
its dual algebra B, and in this case B is the algebra of idempotents of the 
original ring. Now it emerged from the analysis of the above authors that 
the members of E_T are precisely those models of T where B is atomless,
and the stalks of the sheaf are algebraically closed fields. 

But now the pattern is clear. To get the model companion of T we do 
two things: 

(i) Take the model companion of the theory of Boolean algebras (cf. 3.2, 
Example 4); 

(ii) take the model companion of the theory of the stalks.

Then the members of E_T are sections of sheaves over spaces X, where X is
Boolean with atomless dual, and the stalks are models of the model
companion of the theory of the stalks of the original class of sheaves.

MACINTYRE [1973a] and WEISSPFENNING [1973], on the basis of this formal
observation, put the Carson-Lipshitz-Saracino result into a general
setting. Here is a specimen result:


THEOREM 34 (MACINTYRE [1973a]). Let T_1 be a theory of fields. Suppose T_1
is model-complete. Let T be the theory of rings of sections of sheaves F, with
stalks models of T_1, and over a Boolean space without isolated points. Then
T is model complete.


When T_1 is the theory of algebraically closed fields, we get the
Carson-Lipshitz-Saracino theorem. VAN DEN DRIES [1975], MACINTYRE [1973a]
and WEISSPFENNING [1973] looked at what happens when T_1 is the theory of
real closed fields. It turns out that this gives the existence of a
model-companion for a class of lattice-ordered rings, the so-called
regular f-rings (see BIRKHOFF [1967]).

The general principle is that any of the model completeness results of 
Section 2 will transform to an analogous result about regular rings. For 
example, VAN DEN DRIES [1975] and WEISSPFENNING [1973] have results on
differential rings. The p-adic case has not been closely studied. 


6.2. The results above are not in general as strong as those of Section 2. 
The problem is that we do not in general have a prime model extension to a
member of E_T. See SARACINO and WEISSPFENNING [1973]. For T = theory of
regular rings, LIPSHITZ [1974a] has greatly clarified the question of the
existence of prime model extensions.

In the real regular case (regular f-rings), there is a unique prime model 
extension. See LIPSHITZ [1974b] and VAN DEN DRIES [1975]. From this, van
den Dries obtained an analogue of the solution of Hilbert’s 17th Problem
for real regular rings. I believe Robinson would have admired this result.


Notes. (1) This material connects with ℵ_0-categoricity. By results of
WEGLORZ and WASZKIEWICZ [1968], a Boolean extension of an ℵ_0-categorical
structure by an atomless Boolean algebra is ℵ_0-categorical. So by SARACINO
[1973], its theory has a model companion. On the other hand, a Boolean
extension is the structure of sections of a sheaf over a Boolean space. Is the 
model companion the theory of some natural Boolean extension? 

(2) In my version of Theorem 34 I used Comer’s results (COMER [1974]) 
on Feferman-Vaught theorems for sheaves. Later Comer using the formal 
principle of 6.1 obtained model completeness results for polyadic algebras 
(COMER [1973]).

(3) It seems to me that the results of this section should be interpreted 
topoi-theoretically, i.e. by doing model theory inside a category of sheaves. 
The intuition is that the sheaf associated with an e.c. regular ring is an 
algebraically closed field in the sense of a category of sheaves. Work is 
under way on this and related matters (JOYAL [1975], LOULLIS [1976]).
1. Introduction


A free group on a set X of generators has the property that every 
one-one map of X into itself induces a monomorphism of the group into 
itself. Further, if the one-one map of X is surjective, the induced 
monomorphism is actually an automorphism. 

To what extent can models with similar properties be constructed for
general theories? (By theory we usually, but not always, mean a theory in
the language of the first order calculus.) More precisely, for a given set X, is
there a model 𝔄 of T such that every one-one map of X into itself induces
a monomorphism of 𝔄? Obviously, if T involves a linear ordering,
non-order preserving permutations of X cannot be extended to
monomorphisms of 𝔄. Rather surprisingly, the following theorem states
that linear orderings are the only exception to the rule.


1.1. THEOREM (Ehrenfeucht-Mostowski). Let T be a countable theory of the
predicate calculus which has an infinite model and let (X, <) be an infinite
linearly ordered set. There is a model 𝔄 of T with X ⊆ A such that every
order preserving map of X into itself induces a monomorphism of 𝔄 into itself
and every order preserving onto map of X induces an automorphism of 𝔄.


Models with interesting properties can be constructed by varying the 
order type of (X, <). Part of the sequel will be devoted to showing to what
extent 𝔄 is determined by the order type of (X, <).

Section 2 is devoted to proving a weak form of the 
Ehrenfeucht-Mostowski theorem. Section 3 discusses how to generate 
submodels and strengthens the results of Section 2. Section 4 discusses 
extensions of the Ehrenfeucht-Mostowski theorem to non-elementary 
languages. Section 5 gives an application of the Ehrenfeucht-Mostowski 
theorem to group theory and discusses theories whose models contain 
relatively large homogenous sets. Section 6 considers some implications of 
the existence of homogenous sets of order type ω_1 in models of set theory.


2. Partition theorems and the E.-M. theorem 


First, the introduction of some notations and definitions. The notation
X^{[n]} signifies the set


X^{[n]} = {y : y ⊆ X and |y| = n}.
As we shall assume the axiom of choice throughout, we may as well
consider the set X to be linearly ordered. The set X^{(n)} is the set of n-tuples
{(x_0, ..., x_{n-1}) : x_i ∈ X, x_0 < x_1 < ··· < x_{n-1}}. We shall refer to
X^{(n)} as the set of properly ordered n-tuples of X.

The ordinal γ is the set of ordinals < γ. As usual, the least cardinal > κ
is written κ⁺. For each ordinal α, 2(κ, α) is defined by induction:

2(κ, 0) = κ,

2(κ, α) = sup{2^{2(κ, β)} : β < α},  α > 0.

In particular, define

ℶ_α = 2(ω, α).

The generalized continuum hypothesis in this notation is ℵ_α = ℶ_α for all α.
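
As a quick orientation, the first few values of this hierarchy can be unfolded directly from the inductive definition. The display below is only a worked instance of that definition and adds no new assumptions.

```latex
\begin{align*}
2(\kappa,0) &= \kappa, \\
2(\kappa,1) &= \sup\{2^{2(\kappa,0)}\} = 2^{\kappa}, \\
2(\kappa,2) &= \sup\{2^{\kappa},\, 2^{2^{\kappa}}\} = 2^{2^{\kappa}}, \\
2(\kappa,\omega) &= \sup\{2^{2(\kappa,n)} : n < \omega\}, \\
\beth_0 &= \omega = \aleph_0, \qquad \beth_1 = 2^{\aleph_0}.
\end{align*}
```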

Assume a partition P of X^{[n]} into μ disjoint classes. A set Y ⊆ X is
homogenous for P if Y^{[n]} is included in one partition class. If for any set X
of power κ and any partition P of X^{[n]} into μ disjoint classes, X contains a
set Y homogenous for P of power λ, we write


κ → (λ)^n_μ.
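
The arrow notation concerns infinite cardinals, but its finite analogue can be checked by brute force, which may help fix the meaning of "homogenous". The sketch below is an illustration only: the function names are hypothetical, and it treats the finite statements 6 → (3)^2_2 and 5 ↛ (3)^2_2, not any theorem of this chapter.

```python
from itertools import combinations, product


def is_homogeneous(subset, coloring):
    """True if all pairs from `subset` lie in one partition class."""
    colors = {coloring[pair] for pair in combinations(sorted(subset), 2)}
    return len(colors) <= 1


def has_homogeneous_set(points, coloring, size):
    """True if some `size`-element subset of `points` is homogeneous."""
    return any(is_homogeneous(s, coloring) for s in combinations(points, size))


def every_coloring_has_homogeneous_set(n, size, num_colors=2):
    """Check the finite relation n -> (size)^2_{num_colors} exhaustively."""
    points = range(n)
    pairs = list(combinations(points, 2))
    # Enumerate every partition of the pairs into `num_colors` classes.
    for values in product(range(num_colors), repeat=len(pairs)):
        coloring = dict(zip(pairs, values))
        if not has_homogeneous_set(points, coloring, size):
            return False
    return True
```

Calling `every_coloring_has_homogeneous_set(6, 3)` runs through all 2^15 colorings of the pairs of a 6-element set; the same search on 5 points finds a coloring with no homogeneous triple.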


Proofs for the following theorems about the cardinality of homogenous 
sets are given in Chapter B.3 on combinatorics. 


2.1. THEOREM (Ramsey’s Theorem). If m is finite, then


ω → (ω)^n_m.


2.2. THEOREM (Erdős-Rado). For any infinite cardinal κ,


(2(κ, n))⁺ → (κ⁺)^{n+1}_κ.


2.3. THEOREM (Erdős). There is no cardinal κ such that

κ → (ω)^ω_2,


i.e. infinite homogenous sets do not exist for arbitrary partitions of the set
X^{[ω]} of denumerable subsets of X.


2.4. THEOREM (Ehrenfeucht-Mostowski). Let T be a countable theory in a
language L of the predicate calculus which has an infinite model and let
(X, <) be an infinite linearly ordered set. The theory T has a model 𝔄,
X ⊆ A, with the property that for all n all properly ordered n-tuples
(x_0, ..., x_{n-1}) of X satisfy the same set of formulas φ(v_0, ..., v_{n-1}) of L.
Remarks. The set X is said to be homogenous for the model 𝔄. The set Φ
of formulas satisfied by the properly ordered tuples of X is called an
Ehrenfeucht-Mostowski (E.-M.) set. From the Compactness Theorem, it
follows that for any E.-M. set Φ and linear ordering (Y, <), there is a
model containing (Y, <) as a homogenous subset whose properly ordered
tuples satisfy the formulas of Φ.


PROOF. Let 𝔅 be an infinite model of T and φ_0, φ_1, ... an enumeration of
the formulas of L. Impose a linear ordering < on B. (In general, this
ordering is unrelated to the relations of 𝔅.) Suppose φ_0 has free variables
v_0, ..., v_n. The formula φ_0 determines a partition of B^{(n+1)} into two classes
depending on whether the properly ordered n+1-tuple (b_0, ..., b_n)
satisfies φ_0 or not. By Theorem 2.1, there is an infinite B_0 ⊆ B homogenous
for this partition. Let φ_1 have free variables v_0, ..., v_k. Then φ_1 determines
a partition of B_0^{(k+1)} into two classes with an infinite homogenous subset
B_1 ⊆ B_0. Continuing in this fashion, we construct a nested sequence
{B_i : i ∈ ω, B_{i+1} ⊆ B_i} of infinite sets B_i. Enlarge L by adding a new
constant x for each x ∈ X and enlarge T by adding for each formula
φ_i = φ_i(v_0, ..., v_k) and each x_0 < ··· < x_k of X^{(k+1)} the sentence
φ_i(x_0, ..., x_k) or ¬φ_i(x_0, ..., x_k) depending on whether the properly
ordered k+1-tuples of B_i satisfy φ_i or ¬φ_i. The enlarged theory is finitely
consistent and so has a model 𝔄 satisfying the conclusion of the theorem,
by the Compactness Theorem. □


2.5. COROLLARY. If φ(v) is satisfied by an infinite set of elements in some
model of T, then it is consistent to add the condition that the elements of
(X, <) satisfy φ(v) in the model 𝔄 of Theorem 2.4. If φ(v_0, v_1) linearly
orders an infinite set in a model of T, then in 𝔄 the ordering < of X may be
defined by φ.


2.6. DEFINITION. A subsystem of a relation structure 𝔄 is any subset of A
which is closed under the functions defined in 𝔄. A subsystem generated by
a subset X of A is the smallest subsystem of 𝔄 containing X.


Thus, an element of a subsystem of 𝔄 generated by X is represented by a
term t(x_0, ..., x_n) where x_i ∈ X and t is built up from the functions of 𝔄.
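
For a finite structure the generated subsystem can be computed by closing the generators under the functions until nothing new appears. The following is a minimal sketch under that finiteness assumption; the function name and the encoding of functions as `(arity, callable)` pairs are hypothetical choices made for illustration.

```python
from itertools import product


def generated_subsystem(generators, functions):
    """Smallest set containing `generators` and closed under `functions`.

    `functions` is a list of (arity, callable) pairs. The loop terminates
    only if the closure is finite, which is assumed here.
    """
    closed = set(generators)
    while True:
        new_elements = set()
        for arity, fn in functions:
            # Apply each function to every tuple of known elements.
            for args in product(closed, repeat=arity):
                value = fn(*args)
                if value not in closed:
                    new_elements.add(value)
        if not new_elements:
            return closed
        closed |= new_elements
```

For example, with addition modulo 12 as the only function, the subsystem generated by the single element 4 is the set {0, 4, 8}.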

Consider any order preserving map f of the set X homogeneous for the
model 𝔄. The map f can be extended to the subsystem generated by X by
defining

f(t(x_0, ..., x_n)) = t(fx_0, ..., fx_n).


It follows immediately from Theorem 2.4 that f induces a monomorphism 
of the subsystem into itself and that if f is actually onto, the monomor- 
phism is an automorphism of the subsystem. Thus, if we can arrange that 
the subsystem of 𝔄 generated by X is actually a model of T, we will have
accomplished the goal undertaken in Section 1. This feat constitutes the 
next section. 
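
The extension of f from generators to terms is a plain structural recursion. The sketch below assumes a hypothetical encoding in which a term is either a generator or a tuple `(symbol, arg_1, ..., arg_k)`; generators themselves are assumed not to be tuples.

```python
def induced_map(f, term):
    """Extend a map f on generators to terms: f(t(x_0..x_n)) = t(fx_0..fx_n)."""
    if isinstance(term, tuple):
        # A compound term: keep the function symbol, recurse into arguments.
        symbol, *args = term
        return (symbol, *(induced_map(f, arg) for arg in args))
    # A generator: apply f directly.
    return f(term)
```

With the order preserving map x ↦ 2x on the integers, the term `("t", 1, ("s", 2, 3))` is sent to `("t", 2, ("s", 4, 6))`.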


3. Skolem functions and elementary submodels 


Let L be a language of the predicate calculus. Corresponding to each
formula φ(v_0, ..., v_n) of L, introduce an n-ary function symbol f_φ (the
Skolem function) to form a new language L'. If n = 0, the Skolem function
f_φ is a constant c_φ. The languages L^n and L* are defined inductively:

L^0 = L,  L^{n+1} = (L^n)',  L* = ⋃_{n∈ω} L^n.


3.1. DEFINITION. The defining sentence of the Skolem function f_φ is the L*
sentence


∀v_0 ··· ∀v_{n-1} [∃v_n φ(v_0, ..., v_n) → φ(v_0, ..., v_{n-1}, f_φ(v_0, ..., v_{n-1}))].


3.2. DEFINITION. A relation structure 𝔄 of the language L* is a Skolem
model if every defining sentence of L* is valid in 𝔄. If T is a theory in L its
Skolem closure is T* = T ∪ {defining sentences of L*}.


3.3. DEFINITION. The terms of the language L* are called Skolem terms. 


Under the assumption of the axiom of choice, every relation structure 𝔄
of the language L can be expanded to a Skolem model 𝔄* with the same
universe as 𝔄 such that for all formulas φ of L, 𝔄 ⊨ φ(a_0, ..., a_n), a_i ∈ A, if
and only if 𝔄* ⊨ φ(a_0, ..., a_n).


3.4. DEFINITION. An elementary submodel 𝔅 of the relation structure 𝔄 of
the language L is a subsystem of 𝔄 such that


𝔅 ⊨ φ(b_0, ..., b_n), b_i ∈ B,
if and only if
𝔄 ⊨ φ(b_0, ..., b_n)


for every formula φ of L.


Thus, in particular, the same set of sentences of L is valid in 𝔄 and 𝔅 so
that 𝔄 and 𝔅 are models of the same complete theory. The fact that 𝔅 is an
elementary submodel of 𝔄 is denoted by 𝔅 ≺ 𝔄. It can be shown that
𝔅 ≺ 𝔄 if for every formula φ of L


𝔄 ⊨ ∃v_0 φ(v_0, b_1, ..., b_n), b_i ∈ B,


if and only if there is a b_0 ∈ B such that
𝔄 ⊨ φ(b_0, b_1, ..., b_n).


Let B be any subset of the Skolem model 𝔄* of the language L*. From
the preceding paragraph and from the fact that a Skolem model satisfies
the defining sentences of L*, it follows that the subsystem generated by B
in 𝔄* is an elementary submodel of the model 𝔄*. This subsystem is the set
of terms {t(b_0, ..., b_n) : b_i ∈ B, t a Skolem term}.

Now to finish off the Ehrenfeucht-Mostowski theorem 1.1. In the proof
of Theorem 2.4, assume T is a Skolem theory and 𝔅 a Skolem model. The
model 𝔄 whose existence is asserted by Theorem 2.4 is now a Skolem
model. The elementary submodel of 𝔄 generated by X is a (Skolem) model
of T which satisfies all the conclusions of Theorem 1.1.


3.5. DEFINITION. The elementary submodel of the (Skolem) model 𝔄 of
Theorem 2.4 generated by X is denoted by M(Φ, X) where Φ is the E.-M.
set of formulas in the language L* satisfied by the properly ordered tuples
of X.


We devote the rest of this section to some properties of the models 
M(Φ, X).

(1) Every theory T has a model with a non-trivial automorphism. 

Proof. Take for (X, <) a linear ordering such as the rationals which has 
a non-trivial order automorphism. 


3.6. DEFINITION. A monomorphism f between the relation structures 𝔄
and 𝔅 of the language L


f : A → B


is elementary if it has the property that for all formulas φ of L
𝔄 ⊨ φ(a_0, ..., a_n), a_i ∈ A,
if and only if 𝔅 ⊨ φ(fa_0, ..., fa_n).


(2) Consider two structures M(Φ, X) and M(Φ, Y).

(a) Any order preserving map f of X into Y induces an elementary map
of M(Φ, X) into M(Φ, Y).

(b) The elementary monomorphism induced by f is surjective if and 
only if f maps X onto Y. 

Proof. (a) Obvious since properly ordered tuples of X and Y satisfy the 
same formulas. 

(b) Assume y ∈ Y − f(X) and y ∈ f(M(Φ, X)). Then y =
t(fx_0, ..., fx_m) = t(y_0, ..., y_m) for some x_0 < ··· < x_m ∈ X and
y_0 < ··· < y_m ∈ Y, y ≠ y_i, 0 ≤ i ≤ m. Assume y_0 < ··· < y_k < y < y_{k+1} < ··· < y_m.
Since Y is homogeneous with the E.-M. set Φ, the formulas


v_{k+1} = t(v_0, ..., v_k, v_{k+2}, ..., v_{m+1})
        = t(v_0, ..., v_k, v_{l+1}, ..., v_{l+m-k}) = v_l


with l ≠ k+1 are in Φ. But these formulas imply that Y contains at most
m + 2 distinct elements which contradicts the fact that Y is an infinite
linear ordering. □


4. w-logic 


An w-logic is a language with a constant for every natural number and 
two sorts of variables; one sort ranges over the set of natural numbers, the 
other sort ranges over the universe. The set of variables which range over 
the natural numbers we enumerate as w_0, w_1, ..., w_n, ...; the other set of
variables we enumerate as v_0, v_1, ..., v_n, ... . For more details, consult 5.2
of Chapter A.1. The results of this section apply equally well to weak 
second order logic, as defined in 5.3 of Chapter A.1. 

Does the Ehrenfeucht-Mostowski theorem 1.1 generalize to w-logic? 
Obviously there is no difficulty in adding Skolem functions to the language 
L of w-logic to form the language L* and in writing down defining 
sentences for them. The difficulty arises in the construction of the E.-M. set 
Φ of formulas. How do we deal with formulas of the form φ =
∃w_0 (t(v_0, ..., v_n) = w_0)? We must assign to t(v_0, ..., v_n), v_0 < ··· < v_n,
some particular natural number w. In general, the formula φ will induce a
partition P of n+1-tuples into ω disjoint classes. Thus, in place of the
Ramsey Theorem, an appeal to the Erdős-Rado theorem is necessary.
4.1. THEOREM. Let T be a countable theory of an ω-logic L. If T has
ω-models in all powers ℶ_α, α ∈ ω_1, then given any linearly ordered set
(X, <), there is an ω-model 𝔄 of T, X ⊆ A, such that, in 𝔄, all properly
ordered tuples of X satisfy the same set of formulas of L.


PROOF. Assume T is a Skolem theory in the ω-logic L* and the 𝔄_α,
|A_α| = ℶ_α, are Skolem models. Linearly order each A_α and enumerate the
formulas of L*: φ_0, φ_1, ..., φ_n, ... There are two cases to consider at the
first step.

(1) φ_0 = ∃w_0 (t(v_0, ..., v_n) = w_0). The formula φ_0 induces a partition P of
A_{α+n}^{(n+1)} into ω classes, one of which is the set of properly ordered
n+1-tuples which satisfy ψ_0^{-1} = ¬φ_0, and ω other classes each of which is
composed of properly ordered n+1-tuples which satisfy the formula


ψ_0^k = ∃w_0 (t(v_0, ..., v_n) = w_0) ∧ t(v_0, ..., v_n) = k,  k ∈ ω.

By the Erdős-Rado theorem, there is a set C^0_α ⊆ A_{α+n} homogeneous for
P and |C^0_α| ≥ ℶ_α. The particular formula ψ_0^k, k ≥ −1, satisfied by the
properly ordered n+1-tuples of C^0_α depends on α, but since there are
only ω choices and ω is not cofinal in ω_1, there is a cofinal subsequence
{B^0_α, α ∈ ω_1} of the sequence {C^0_α, α ∈ ω_1} such that for some k ≥ −1, for
all α < ω_1 the properly ordered n+1-tuples of B^0_α satisfy the same formula
ψ_0^k and |B^0_α| ≥ ℶ_α.

(2) φ_0 = φ_0(v_0, ..., v_n) and φ_0 is not of the form ∃w_0 (t(v_0, ..., v_n) = w_0).
The formula φ_0 induces a partition P of A_{α+n}^{(n+1)} into two disjoint classes,
one composed of those properly ordered n+1-tuples which satisfy the formula
ψ_0^0 = φ_0 and the other of those which satisfy ψ_0^{-1} = ¬φ_0. By Erdős-Rado,
there is a set C^0_α ⊆ A_{α+n} homogenous for P and |C^0_α| ≥ ℶ_α. Noting that 2 is
not cofinal in ω_1, we find a cofinal subsequence {B^0_α, α ∈ ω_1} of the sequence
{C^0_α, α ∈ ω_1} such that the properly ordered n+1-tuples of B^0_α satisfy just
one of the formulas ψ_0^0 = φ_0 or ψ_0^{-1} = ¬φ_0 for all α ∈ ω_1.

Continuing the induction, construct a set of sequences {{B^n_α, α ∈ ω_1},
n ∈ ω} such that

(i) |B^n_α| ≥ ℶ_α,

(ii) B^{n+1}_α ⊆ B^n_β for some β and the set of ordinals


{β : ∃α ∈ ω_1 (B^{n+1}_α ⊆ B^n_β)}


is cofinal in ω_1.

(iii) There is k ≥ −1 such that the properly ordered tuples of B^n_α, for all
α ∈ ω_1, satisfy ψ_n^k. Denote the formula ψ_n^k by ψ_n.

The E.-M. set Φ constructed by this process is the set of formulas
{ψ_n : n ∈ ω}.

The conclusion of the proof of Theorem 4.1 is similar to that of Theorem
2.4. □

The proof of Theorem 4.1 yields directly a form of the
Ehrenfeucht-Mostowski theorem 1.1 for theories T of ω-logic.

Another non-elementary language is L_{ω₁ω}. The formulas of a language of
this sort are built up by induction from the formulas of a predicate calculus
by allowing countable conjunctions and disjunctions but only finite
quantifications. No formula of L_{ω₁ω} has more than a finite number of free
variables.

Theories T which are sentences of L_{ω₁ω} behave like theories of a
countable ω-logic. This phenomenon is explained by the fact that a
countable disjunction ⋁_{i∈ω} Φ_i(v_0, ..., v_n) of formulas Φ_i of the predicate
calculus can be replaced by the single formula ∃w_0 Φ(w_0, v_0, ..., v_n) of
ω-logic with Φ(i, v_0, ..., v_n) = Φ_i(v_0, ..., v_n) and a countable conjunction
⋀_{i∈ω} Φ_i(v_0, ..., v_n) by the single formula ∀w_0 Φ(w_0, v_0, ..., v_n) of ω-logic.
Thus, Theorem 4.1 applies equally well to theories T which are sentences
of L_{ω₁ω}.


5. Types and Ehrenfeucht—Mostowski models 


The next two theorems have interesting applications to algebraic 
theories, two of which are discussed in this section. 


5.1. DEFINITION. A one-type of a theory T of the language L is a maximal
set {φ_i = φ_i(v_0) : φ_i a formula of L} of formulas consistent with T. An n-type
of a theory T of the language L is a maximal set {φ_i = φ_i(v_0, ..., v_{n-1}) : φ_i a
formula of L} of formulas consistent with T. A one-type of a theory T of L
with respect to a relation structure 𝔅 of L is a maximal set
{φ_i(v_0, b_1, ..., b_n) : b_i ∈ B, φ_i = φ_i(v_0, ..., v_n) a formula of L} of formulas
with parameters from B consistent with T.


5.2. THEOREM. A countable theory T of the predicate calculus which has an
infinite model has models 𝔄, in every infinite power κ, which contain only a
countable number of types with respect to any countable subset of A.


Note. It will follow from the proof of Theorem 5.2 that if T is a countable
theory of ω-logic which has ω-models of power ≥ ℶ_α for all α ∈ ω_1, then T
has ω-models 𝔄 in every infinite power κ which satisfy the conclusion of
Theorem 5.2.
PROOF. We will show that if κ is an infinite cardinal and Φ is an E.-M. set
of formulas for the Skolem theory T, then M(Φ, κ), where κ is the set of
ordinals < κ, contains only a countable number of types with respect to any
of its countable subsets C. An element of M(Φ, κ) corresponds to a Skolem
term t(λ_0, ..., λ_n), λ_i ∈ κ, λ_0 < ··· < λ_n. The elements of the countable set
C correspond to terms in which the λ_i are elements of some countable
subset C' ⊆ κ. The set C' is called the support of C. With respect to C, the
type of a given term t(λ_0, ..., λ_n) is determined by the term t(v_0, ..., v_n)
and the order of the λ_i with respect to C'. The set C' ⊆ κ is well ordered; there
are only a countable number of ways to interpolate a finite set into it. Since
there are only countably many Skolem terms t(v_0, ..., v_n), there are only
countably many types with respect to C. □


5.3. THEOREM. A countable theory T which has a definable infinite linear
ordering has non-isomorphic models in every uncountable power. 


PROOF. Applying the theorem above, we need only produce models 𝔄 of
power κ ≥ ω_1 which have an uncountable number of one-types with
respect to a countable subset C ⊆ A. Assume T is a Skolem theory, let the
linear ordering be defined by φ(v_0, v_1) and let (X, <) be a linearly ordered
set of power ω_1 containing a dense countable subset W. For every element
x ∈ X add a constant to T and, also, the sentences φ(x_0, x_1) if and only if
x_0 > x_1. The expanded theory T_1 is finitely consistent and therefore has
models in every uncountable power κ. Each model 𝔄 of T_1 contains
uncountably many types with respect to the countable subset W since each
x ∈ X realizes a different cut with respect to W. □


In MACINTYRE and SHELAH [to appear] there is a nice application of
Theorem 5.1. They prove the existence in every uncountable power κ of
non-isomorphic universal locally finite groups. The following material is 
taken from their paper. 


5.4. DEFINITION. A group G is universal locally finite if 
(1) G is locally finite, i.e. every finitely generated subgroup is finite. 
(2) An isomorphic copy of every finite group is embedded in G. 
(3) If G_1 and G_2 are isomorphic finite subgroups of G, then there is an
inner automorphism of G mapping G_1 onto G_2.


It is known that there are universal locally finite groups of each infinite
power κ. Any locally finite group of infinite power κ can be embedded in a
universal locally finite group of power λ for each λ ≥ κ. All countable
universal locally finite groups are isomorphic.

The theory of universal locally finite groups G is a sentence T of the
language L_{ω₁ω}. The sentence T is obtained by

(i) enumerating the countable set of multiplication tables D_{G_i} of finite
groups G_i and asserting that every finite subset of elements of G satisfies
one of these tables and that for every D_{G_i} there is a finite subset of G which
satisfies it,

(ii) asserting that if two finite subsets of G each satisfy the same D_{G_i},
then there is an inner automorphism of G carrying one of the subsets onto
the other.


5.5. THEOREM. There are non-isomorphic universal locally finite groups in
every uncountable power κ.


PROOF. The theory T of locally finite groups has been shown to be a
sentence of L_{ω₁ω}. Since it is known that locally finite groups exist in all
infinite powers, the conclusion of Theorem 5.2 applies to T. We will find a
locally finite group Γ of power ω_1 which contains ω_1 different types with
respect to some countable subset C ⊆ Γ. The group Γ is embeddable in a
universal locally finite group of any power κ ≥ ω_1.

Let S_3 be the symmetric group on three elements. S_3 has two generators
α and β, α² = 1, β³ = 1, αβ ≠ βα. The (complete) direct product H^κ of κ
copies of any finite group H is locally finite. Hence S_3^ω is locally finite. The
group S_3^ω contains the countable subset


C = {a_i : i ∈ ω, a_i(n) = α if n = i, a_i(n) = 1 if n ≠ i}.

Choose a set P of ω_1 distinct subsets of ω. The group S_3^ω contains the set C_P,

C_P = {c_X : X ∈ P, c_X(n) = 1 if n ∈ X, c_X(n) = β if n ∉ X}.

Note that c_X · a_i = a_i · c_X if and only if i ∈ X. Thus, for X, Y distinct
elements of P, the subsets of elements of C with which c_X and c_Y
commute are distinct with the result that, with respect to C, c_X and c_Y
have different types. Let Γ be the group generated by the elements of C
and C_P in S_3^ω. The group Γ, as a subgroup of a locally finite group, is locally
finite, has power ω_1 and contains ω_1 types with respect to the countable
subset C. □
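
The commutation pattern used in this proof can be checked mechanically on a finite truncation of the product. The sketch below works in a product of finitely many copies of the symmetric group on three points rather than in the full countable power, and its encoding of permutations as tuples and all function names are hypothetical choices for illustration.

```python
IDENTITY = (0, 1, 2)
ALPHA = (1, 0, 2)  # a transposition, so ALPHA has order 2
BETA = (1, 2, 0)   # a 3-cycle, so BETA has order 3


def compose(p, q):
    """Composition of permutations of {0, 1, 2}: (p o q)(x) = p[q[x]]."""
    return tuple(p[q[x]] for x in range(3))


def product_mul(g, h):
    """Coordinatewise multiplication in a finite power of S_3."""
    return tuple(compose(a, b) for a, b in zip(g, h))


def a(i, length):
    """The element a_i: ALPHA in coordinate i, identity elsewhere."""
    return tuple(ALPHA if n == i else IDENTITY for n in range(length))


def c(X, length):
    """The element c_X: identity on coordinates in X, BETA elsewhere."""
    return tuple(IDENTITY if n in X else BETA for n in range(length))


def commuting_indices(X, length):
    """The set of i < length for which c_X commutes with a_i."""
    cx = c(X, length)
    return {
        i
        for i in range(length)
        if product_mul(cx, a(i, length)) == product_mul(a(i, length), cx)
    }
```

In this truncation `commuting_indices(X, length)` recovers exactly the members of X below `length`, which is the finite shadow of the statement that distinct sets X give distinct commutation patterns over C.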


We give a brief discussion of w-stable theories, generally omitting 
proofs. 
5.6. DEFINITION. A complete theory T is called κ-stable if the number of
one-types realized in any model 𝔄 of T with respect to an arbitrary subset
C of A, |C| ≤ κ, is at most κ.


Note. It is easy to see that an equivalent definition results if "one-type" is
replaced by "n-type", for any n ≥ 1.


5.7. THEOREM. If T is ω-stable, then T is κ-stable for all κ, κ ≥ ω.


5.8. DEFINITION. Let 𝔄 be a relation structure, X a subset of A, and S_n the
symmetric group on n elements. An n-ary relation R is connected over X
in 𝔄 if for any sequence (x_0, ..., x_{n-1}) of n distinct elements of X, there is
an s ∈ S_n such that 𝔄 ⊨ R(x_{s(0)}, ..., x_{s(n-1)}); it is anti-symmetric over X in 𝔄
if there is s ∈ S_n such that 𝔄 ⊨ ¬R(x_{s(0)}, ..., x_{s(n-1)}).


Note. If 𝔄 = (X, <), then < is an anti-symmetric connected binary
relation over X. 


5.9. THEOREM. No model 𝔄 of an ω-stable theory T contains an infinite
subset X over which there exists an anti-symmetric connected relation R.


PROOF. A variant of Theorem 5.1. □


5.10. DEFINITION. A linearly ordered set (X, <), X a subset of the relation
structure 𝔄, is homogeneous with respect to the subset B of A if for all
properly ordered tuples of X, x_0 < ··· < x_n and y_0 < ··· < y_n, and all
formulas φ(v_0, v_1, ..., v_n, ..., v_{n+m}) of L,


𝔄 ⊨ φ(x_0, ..., x_n, b_{n+1}, ..., b_{n+m})
if and only if
𝔄 ⊨ φ(y_0, ..., y_n, b_{n+1}, ..., b_{n+m})


where the parameters b_i are arbitrary elements of B.


5.11. THEOREM. Let T be an ω-stable theory, κ a regular uncountable
cardinal, 𝔄 a model of T of power κ and B any subset of A of power < κ.
Then 𝔄 contains a subset X of power κ which is homogeneous with respect
to B.


This theorem is remarkable in that it asserts that the model 𝔄 must
contain a large homogenous set X. Using Theorem 5.9, it is possible to
show that the formulas satisfied by the tuples of X do not depend on their
ordering. Thus, viewed in the context of the Ehrenfeucht-Mostowski
theorem 1.1, the models M(Φ, X) of ω-stable theories provide a complete
analog to the example of free groups discussed in Section 1.


6. Partition theorems for large cardinals and the constructible universe 


We introduce a partition theorem that will enable us to deduce the
existence of an E.-M. set Φ such that M(Φ, X) is well ordered if X is well
ordered. This condition on Φ is necessary if we wish to apply results about
the models M(Φ, X) to set theory.

The notation κ^{<ω}, κ a cardinal, κ = {λ : λ an ordinal < κ}, refers to the
set of all properly ordered finite sequences of κ.


6.1. DEFINITION. A subset Y of $\kappa$ is homogeneous for a partition P of $\kappa^{<\omega}$
into disjoint sets if, for each n, $Y^{n}$ (the set of properly ordered n-tuples
of Y) is contained in one partition class. The partition class may depend on n.
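
Definition 6.1 concerns infinite cardinals, but the notion of a homogeneous set can be illustrated on a finite set. The following is a minimal sketch, not part of the original text: the universe {0,...,5}, the two-class partition `parity_color`, and the helper names are all hypothetical, and only a single tuple length n = 2 is checked rather than every n.

```python
from itertools import combinations


def is_homogeneous(subset, color, n):
    """True if every properly ordered n-tuple from `subset` lies in one class."""
    colors = {color(t) for t in combinations(sorted(subset), n)}
    return len(colors) <= 1


def largest_homogeneous(universe, color, n):
    """Brute-force search for a largest subset homogeneous for n-tuples."""
    points = sorted(universe)
    for size in range(len(points), 0, -1):
        for candidate in combinations(points, size):
            if is_homogeneous(candidate, color, n):
                return set(candidate)
    return set()


def parity_color(pair):
    # Hypothetical partition of pairs into two classes: parity of the sum.
    a, b = pair
    return (a + b) % 2


# Finite stand-in for the infinite situation of Definition 6.1 (assumption).
Y = largest_homogeneous(range(6), parity_color, 2)
print(Y)
```

The search is exponential and is meant only to make the definition concrete; the partition theorems of this section assert the existence of large homogeneous sets where no such search is possible.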


In this section our attention will center on cardinals $\kappa$ such that
$\kappa \to (\omega_1)^{<\omega}_{2}$. The existence of a cardinal $\kappa$ with this property cannot be
deduced from the axioms of ZFC. However, it does follow from these
axioms that the first such cardinal is weakly compact, strongly inaccessible,
and, indeed, larger than the least cardinal satisfying any of the combinatorial
properties mentioned in the body of Chapter B.3. The following
theorem is due to Rowbottom. A proof can be found in MORLEY [1968].


6.2. THEOREM. If $\kappa \to (\omega_1)^{<\omega}_{2}$, then $\kappa \to (\omega_1)^{<\omega}_{2^{\aleph_0}}$.


6.3. THEOREM. Assume that $\kappa \to (\omega_1)^{<\omega}_{2}$ and that the theory T has a well
ordered model $\mathfrak{A}$ of power $\kappa$. Then $\mathfrak{A}$ contains a homogeneous set Y of power
$\omega_1$ and the E.-M. set $\Phi$ satisfied by the properly ordered tuples of Y has the
property that if (X, <) is well ordered, the model $M(\Phi, X)$ is well ordered.


PROOF. Assume that T is a Skolem theory and $\mathfrak{A}$ a Skolem model of T.
Denote the well ordering of $\mathfrak{A}$ by <. The countable set of formulas of L*
induces a partition of $\kappa^{<\omega}$ into $2^{\aleph_0}$ classes. By Theorem 6.2, there is a
subset Y of power $\omega_1$ of $\mathfrak{A}$ which is homogeneous for this partition. Let $\Phi$
be the set of formulas satisfied by the properly ordered tuples of Y.

Assume for some well ordered (X, <) that $M(\Phi, X)$ is not well ordered.
$M(\Phi, X)$ contains a countable descending chain $z_0 > z_1 > \cdots > z_n > \cdots$.
The sequence $\{z_i : i \in \omega\}$ has a countable support $C \subseteq X$; every $z_i$ is
represented by some term $t(x_0, \dots, x_n)$, $x_i \in C$. Since C is a countable well
ordered set, there is an order preserving map f of C into the homogeneous
set Y of power $\omega_1$, $Y \subseteq A$. The map f induces an elementary monomorphism
of $M(\Phi, C)$ into $M(\Phi, Y)$. $M(\Phi, Y)$ is an elementary submodel of
$\mathfrak{A}$; $\mathfrak{A}$ therefore contains a countable descending chain $fz_0 > fz_1 > \cdots >
fz_n > \cdots$, contradicting the fact that < well orders A.


6.4. THEOREM. If the theory T satisfies the hypotheses of Theorem 6.3, then
we can find an E.-M. set $\Psi$ for T which, besides satisfying the conclusion of
Theorem 6.3, has the property that for every Skolem term t, $\Psi$ contains the
formulas


$(1)_t$  $t(v_0, \dots, v_{m+n}) \le v_m \rightarrow t(v_0, \dots, v_{m+n})
= t(v_0, \dots, v_m, v_{m+n+1}, \dots, v_{m+2n})$,
(2), Un < t(Vo,-- 65 Un—ty Unttg e+ +9 Um) Vas 
= t(Vo,---, Un—ty Untty oo ey Um) 


PROOF. Consider the model $M(\Phi, \omega_1)$ where $\Phi$ is the E.-M. set whose
existence is asserted in Theorem 6.3. Let P be the set of all homogeneous
subsets Y of power $\omega_1$ contained in $M(\Phi, \omega_1)$ and let $\Gamma = \{\lambda : \lambda$ is the $\omega$-th
element of Y with respect to the well ordering < of the model, $Y \in P\}$. The
minimum $\lambda_0$ of $\Gamma$ exists and is an element of $\Gamma$. Let $Z \in P$ be a
homogeneous set whose $\omega$-th element is $\lambda_0$ and let $\Psi$ be the E.-M. set
satisfied by properly ordered tuples of Z. □


It is an exercise to show that if $\Psi$ did not include the formulas $(1)_t$ or $(2)_t$
with respect to some Skolem term t, then t would give rise to a set of power
$\omega_1$ homogeneous for $M(\Phi, \omega_1)$ whose $\omega$-th term was strictly less than $\lambda_0$.

Henceforth we assume familiarity with the usual axiom systems for set 
theory as discussed in Chapter B.1. 

Suppose T is the usual set of axioms for Zermelo-Fraenkel set theory
with choice (ZFC) plus the axiom of constructibility V = L. If T has a
model $\mathfrak{A}$ whose ordinals have order type $\lambda$, $\lambda \ge \kappa$, $\kappa \to (\omega_1)^{<\omega}_{2}$, Theorem 6.3
asserts that $\mathfrak{A}$ contains a set of ordinals Y of power $\omega_1$ which are
homogeneous for $\mathfrak{A}$. In this case, it is possible to prove the existence of an
E.-M. set $\Psi$, called remarkable, which contains in addition to the formulas
$(1)_t$ and $(2)_t$ of Theorem 6.4, the formulas

$(3)_t$  $t(v_0, \dots, v_{n-1}) < v_n$

for each Skolem term t of L*.

For a remarkable E.-M. set $\Psi$, the ordinals of the models $M(\Psi, \kappa)$, $\kappa$ an
uncountable cardinal, are not only well ordered but actually have order
type $\kappa$. Thus, by the Gödel isomorphism theorem, $M(\Psi, \kappa)$ is actually
isomorphic to $(\{F(\alpha) : \alpha < \kappa\}, \in)$ when $F(\alpha)$ enumerates the set constructible
at level $\alpha$.

Using the reflection principle, it is easy to show that $M(\Psi, \kappa)$, $\kappa$ an
uncountable cardinal, is actually an elementary subsystem of the constructible
universe. In particular, $M(\Psi, \omega_1)$ is an elementary subsystem of the
constructible universe.

From the fact that $\Psi$ is a remarkable E.-M. set, it follows that if $\alpha$ is a
limit ordinal and $\beta > \alpha$, the ordinals of $M(\Psi, \alpha)$ are an initial segment of
the ordinals of $M(\Psi, \beta)$. Thus the isomorphism between $M(\Psi, \omega_1)$ and
$(\{F(\alpha) : \alpha < \omega_1\}, \in)$ carries $M(\Psi, \omega)$ onto $(\{F(\alpha) : \alpha < \beta\}, \in)$ for some
countable limit ordinal $\beta$. But since $M(\Psi, \omega) \prec M(\Psi, \omega_1)$ and $M(\Psi, \omega_1)$ is
an elementary subsystem of the constructible universe L, it follows that
$(\{F(\alpha) : \alpha < \beta\}, \in)$ is also an elementary subsystem of the constructible
universe.

We are now in a position to draw some conclusions about the extent of
the constructible universe L if the existence of a cardinal $\kappa$, $\kappa \to (\omega_1)^{<\omega}_{2}$, is
assumed.

A definable element of the constructible universe L is necessarily a
definable element of each of its elementary subsystems. In particular, every
definable set of L is an element of the elementary subsystem $(\{F(\alpha) : \alpha <
\beta\}, \in)$. But then, as viewed from the outside, every definable subset of L is
countable. Thus the set of constructible subsets of $\omega$ is countable as seen
from outside of L, though within L it has power $\omega_1$.

Using the techniques of Theorem 5.2, it is possible to show that the set of
constructible subsets of any cardinal $\kappa$ has power $\kappa$ as viewed from outside
of L. Of course, within L, the power set of $\kappa$ is of power $2^{\kappa}$. The
constructible universe is small indeed and diverges sharply from the "real"
world provided, of course, in this real world there is a cardinal $\kappa$,
$\kappa \to (\omega_1)^{<\omega}_{2}$.

Recall that the formulas of the language of ZF can be effectively coded
into the natural numbers. The set of integers corresponding to the
remarkable E.-M. set $\Psi$ is called $0^{\#}$ in the literature. Of course $0^{\#}$ is not a
constructible set. The existence of the large cardinal $\kappa$, $\kappa \to (\omega_1)^{<\omega}_{2}$, implies
that a relatively simple set of integers $0^{\#}$ is not constructible. On the other
hand, if we assume the existence of the countable combinatorial property
that the set of integers $0^{\#}$ exhibits, we are led to the conclusions concerning
the size of the constructible universe. The existence of $0^{\#}$, a countable
combinatorial property, places sharp bounds on the extent of the constructible
universe. For more on this subject see SILVER [1973].


References 


CHANG, C.C. and H.J. KEISLER
[1973] Model Theory (North-Holland, Amsterdam).

MACINTYRE, A. and S. SHELAH
[to appear] Uncountable universal locally finite groups.

MORLEY, M.
[1967] Partitions and models, in: Proceedings of the Summer School in Logic, Leeds 1967,
edited by M.H. Löb, Lecture Notes in Mathematics, Vol. 70 (Springer, Berlin).

SILVER, J.
[1973] The bearing of large cardinals on constructibility, in: Studies in Model Theory,
edited by M. Morley (Math. Assoc. Am., Buffalo, NY).


A.6


Infinitesimal Analysis of Curves and Surfaces


K. D. STROYAN


Contents


1. Robinson's formulation of Leibniz' Principle
2. Elements of infinitesimal calculus
3. Continuous curvature and differentials
4. Kissing curves
5. Gauss' investigations of surfaces
References


© North-Holland Publishing Company, 1977


HANDBOOK OF MATHEMATICAL LOGIC
Edited by J. Barwise


198 STROYAN/INFINITESIMAL ANALYSIS cH. A.6, §1] 


1. Robinson’s formulation of Leibniz’ Principle 


The intuitive notions of infinitely large and infinitesimally small numbers
appeal to most people. These notions in one form or another were
fundamental concepts in differential and integral calculus for about two
hundred years after its invention by Newton and Leibniz. The notion of an
infinitesimal number was recently put on a sound basis and in a way which
retains the intuitive appeal of such numbers without contradiction. In this
article we give some basic applications of infinitesimals in the study of the
geometry of curves and surfaces. This provides rigorous arguments in the
clear geometric style of Gauss. Moreover, we feel this approach hints at an
insight of Riemann and others which may seem surprising to the modern
mathematician, namely, that uniform curvature conditions are a very direct
way to approach the now-common "$C^1$" assumptions. The uniform
smoothness conditions for the first derivative are so simple and natural that
they could easily be used at a freshman level, avoiding a certain amount of
technicality in the usual approach. Finally, modern differential forms can
be obtained by a factorization in the infinitesimals, providing a direct link to
Gauss: the infinitesimal approach can be automatically "modernized".

Introduction of infinitesimals into mathematics in the late seventeenth
century was a source of great controversy, one which was not resolved
mathematically until 1960. The resolution may not yet be widely understood.
In the late eighteenth century, d'Alembert attempted to base
calculus on the notion of limit in order to resolve the difficulty. In the late
nineteenth century, through the research and influence of Weierstrass, the
"epsilon-delta" method, akin to the ancient Greek "method of exhaustion",
became the fundamental "limiting" technique of analysis. Bolzano
had proposed the method some fifty years earlier, but was largely ignored.
Cauchy's systematic use of limits, which he defined using infinitesimals,
provided a link between the position of de l'Hôpital, the author of the first
calculus text, who treated infinitesimals as a metaphysical fact, and that of
Weierstrass. The "epsilon-delta" method reigned so supreme in the years
following Weierstrass that there seemed to be a feeling that infinitesimals
could not be treated consistently. Leibniz' approach ultimately was vindicated.

In 1961, Abraham Robinson gave a precise formulation of the
metamathematical principle which Leibniz proposed to govern the
infinitesimals (ROBINSON [1961]). Leibniz considered the infinitesimals to be a
"useful fiction" rather than a metaphysical fact but was concerned with the
rules which they obey. He stated the principle that the "ideal numbers"
were to have the same properties as the "finite numbers" and vice versa,
but he was not specific as to which properties this was to apply to. For
example, there is a clear contradiction with the completeness axiom or the
well ordering of the natural numbers. The modern logician will recognize
these examples as troublesome "higher order" formulas, that is, ones with
quantifiers over sets rather than only numbers. The Archimedean axiom
provides an even more compelling problem to which Leibniz did not
address himself. The tools of modern model theory, which studies the
relationship between formal languages and mathematical structures,
provide a framework in which Robinson could formulate Leibniz' principle.

A formal language of ordinary analysis has an interpretation in a
nonstandard model by virtue of the Gödel Completeness Theorem or the
Compactness Theorem (see Proposition 2.7 in Chapter A.1, for example).
Nonstandard extensions also can be exhibited by the ultrapower method
discussed in Chapter A.3.

A simple algebraic theorem says that once one has a proper ordered field 
extension of the real numbers, the extension is non-Archimedean, that is, 
contains infinitesimals and their infinite reciprocals. Moreover, a formal 
sentence whose interpretation in the standard model is true has a true 
interpretation in the nonstandard model and vice versa. This is the rule of 
Leibniz, but there is a difficulty — why doesn’t the Archimedean property 
hold in the nonstandard model contrary to our algebraic reasoning above? 
Or more technically: How can you apply the Compactness Principle to a 
higher order language? The answer is that Robinson uses a first order 
language for higher order formal analysis and the interpretation of higher 
order sentences in the nonstandard model is subsequently weaker than 
the informal meaning. But how can a first order theory be useful in anal- 
ysis ...? 

We shall sketch the idea here and reinforce it with examples. The only
formal subtlety is the care one must exercise with quantifiers. First of all,
"for all x" and "there exists y" are not permitted; only bounded
quantifiers are allowed, e.g., "for every positive $\varepsilon$ in R" and "there exists
an element n in N". The really tricky part of this in the nonstandard model is
that quantifiers over sets are specified to run over a particular set of sets.
All the sets which come up in classical analysis have nonstandard extensions
by a map denoted "*". One obtains Leibniz' transferred property by
applying * to each set in a sentence; this is called the *-transform. First
order properties transfer identically, but higher order properties have
restrictions on their quantifiers: only internal sets lie in the scope of a
transformed quantifier.




One suitable definition of "all the sets which arise in classical analysis"
begins by defining a superstructure over a set of atoms (or urelements):


$X_0$, an infinite set of atoms,

$X_{k+1} = \mathcal{P}\Bigl(\bigcup_{n=0}^{k} X_n\Bigr)$, $k \in \mathbb{N}$,

finally,

$\mathcal{X} = \bigcup_{k \in \mathbb{N}} X_k$.

We shall begin by looking at the case where $X_0 = \mathbb{R}$, the real numbers (as a
set of atoms). Later we need to construct a superstructure $\mathcal{Y}$ on another set
of atoms. Real valued functions are elements of $\mathcal{X}$, since functions "are"
sets of ordered pairs and ordered pairs "are" sets {{a},{a,b}}. The
euclidean spaces $\mathbb{R}^n$ and all their subsets are also in $\mathcal{X}$ as elements. The
classical spaces $\ell^2(\mathbb{N})$, $L^2[0, 1]$, $H^\infty(U)$, and so on, all are elements of $\mathcal{X}$.
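
The level-by-level construction can be made concrete on a toy example. The sketch below is an illustration under loudly stated assumptions: it uses a two-element set of atoms (the text requires an infinite set, with the reals as the main case), and the names `powerset`, `superstructure_levels` and `kuratowski_pair` are hypothetical. It builds the first levels and checks that an ordered pair {{a},{a,b}} of atoms appears at level 2.

```python
from itertools import combinations


def powerset(s):
    """All subsets of s, each as a frozenset."""
    items = list(s)
    return frozenset(
        frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)
    )


def superstructure_levels(atoms, depth):
    """Return [X_0, ..., X_depth] with X_{k+1} = P(X_0 ∪ ... ∪ X_k)."""
    levels = [frozenset(atoms)]
    for _ in range(depth):
        union = frozenset().union(*levels)
        levels.append(powerset(union))
    return levels


def kuratowski_pair(a, b):
    """The ordered pair (a, b) coded as the set {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})


# Toy atoms {0, 1}: a finite stand-in, NOT the infinite atom set of the text.
levels = superstructure_levels({0, 1}, 2)
print([len(level) for level in levels])
print(kuratowski_pair(0, 1) in levels[2])
```

Even with two atoms the levels grow very quickly, which is why only the first few are computed here; a function on the atoms, being a set of such pairs, would first appear one level higher.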

Elements of $\mathcal{X}$ are called entities and the set theory of entities provides a
framework for classical analysis. The algebraic object $(\mathcal{X}, \in)$ is called a
superstructure. The bounded first order language of $\in$ describes the set
theory of entities where we always have


$\forall x\,[x \in A \Rightarrow \cdots]$ or $\exists y\,[y \in B \;\&\; \cdots]$


with A and B entities of $\mathcal{X}$. We have one constant in our language $L(\in)$
for each element of $\mathcal{X}$. These constants in turn have interpretations back in
$\mathcal{X}$; for example, p of L might stand for $\mathbb{R} \in \mathcal{X}$ and we name the standard
interpretation map "I", $I(p) = \mathbb{R}$ in this case. We may think of any
property of analysis expressed in terms of set theory and thus in terms of
the relations $\in$ and = of the superstructure. This is what we apply
nonstandard extension to.


1.1. LEIBNIZ' PRINCIPLE. There is a set of atoms $Y_0$ and a mapping
$* : \mathcal{X} \to \mathcal{Y}$, $\mathcal{Y}$ being the superstructure based on $Y_0$, with the following
properties:

(i) ${}^*\mathbb{R} = Y_0$, the map applied to the ground set entity in $\mathcal{X}$ is the ground set
in $\mathcal{Y}$.

(ii) The extension is proper, the hyperreal numbers ${}^*\mathbb{R}$ properly contain the
standard embedded reals ${}^\sigma\mathbb{R} = \{{}^*r : r \in \mathbb{R}\}$; in particular, there is a nonzero
infinitesimal in ${}^*\mathbb{R}$.

(iii) Every sentence about $\mathcal{X}$ with a bounded formalization in the language
of $\in$ holds in $\mathcal{X}$ if and only if the *-transform obtained by extending each
constant of the sentence holds in the range ${}^*\mathcal{X}$.
PROOF. Apply the Compactness Theorem (Theorem 2.4 in Chapter A.1) to
the set of sentences of $L(\in)$ whose interpretation in $\mathcal{X}$ is true plus the
infinite set of sentences saying $c \in I^{-1}(\mathbb{R})$ but $\{c \ne I^{-1}(r) : r \in \mathbb{R}\}$, where "c"
is a new constant which has not already been used. Altogether these
formulas are finitely satisfiable, thus have a model W. (As an alternative to
Compactness we may apply the Fundamental Theorem on ultraproducts
from Chapter A.3 to obtain the same result.)

Now we apply the Mostowski collapsing of the well-founded portion of
W (see 3.7 in Chapter A.7 for example). This gives us a ground set ${}^*\mathbb{R}$ as the
collapsed interpretation of the constant $I^{-1}(\mathbb{R})$. The rest of the collapsing of
W lies inside the superstructure on ${}^*\mathbb{R}$. Bounded formulas only refer to this
part, so our theorem is proved. Collapsing is shown in Fig. 1.


Fig. 1. A picture of ${}^*\mathcal{X}$ showing "infinite types" chopped off and external part of $\mathcal{Y}$.


We view the nonstandard model ${}^*\mathcal{X}$ of our language as a proper
elementary extension of $\mathcal{X}$ (with respect to the bounded quantifier formulas)
embedded in the superstructure $\mathcal{Y}$ based on the hyperreals ${}^*\mathbb{R}$. It is
important that we embed ${}^*\mathcal{X}$, that is, have "real" $\in$ as the interpretation
in the nonstandard model, in order to be able to deal with both internal and
external sets. All the new constructions will be based on external sets, yet
involve important interactions with internal ones. □


Now we give some examples and consequences of Leibniz' Principle. A
first order example is commutativity of addition:

$I\{\forall x\, \forall y\, [(x \in p \;\&\; y \in p) \Rightarrow (x + y = y + x)]\}$, where $I(p) = \mathbb{R}$,

"for every x and y in the reals R, x + y equals y + x". The *-transform is:

$I'\{\forall x\, \forall y\, [(x \in p \;\&\; y \in p) \Rightarrow (x + y = y + x)]\}$, where $I'(p) = {}^*\mathbb{R}$,

"for every x and y in the hyperreals *R, x + y equals y + x". In a like
manner, all the first order ordered field axioms transfer to *R. Notice that we
have extended addition, which can be thought of as a ternary relation
$S = \{(x, y, z) : x + y = z;\ x, y, z \in \mathbb{R}\}$. It is customary not to write a star on
operations like $+$, $\cdot$, $\le$, $=$, or on standard functions, since the extension
agrees on standard points.

The Archimedean axiom, "for every positive real x, there is a natural
number n, such that x > 1/n",

$I\{\forall x\, \exists n\, [x \in \mathbb{R}^{+} \Rightarrow [n \in \mathbb{N} \;\&\; [x > 1/n]]]\}$,

transforms to

$I'\{\forall x\, \exists n\, [x \in {}^*\mathbb{R}^{+} \Rightarrow [n \in {}^*\mathbb{N} \;\&\; [x > 1/n]]]\}$,

by abuse of notation involving the constants for R and *R, N and *N, the less
than relation, and division: "every positive hyperreal x is larger than 1/n for
some positive hyperinteger n". By being forced to extend the bounds on the
existential quantifier we do not obtain Archimedes' axiom, but rather that
each positive infinitesimal is more than the reciprocal of some infinite integer
of *N, the extension of the set of natural numbers. The numbers 1, 2, 3, etc.,
from N each have a unique extension in *N; we call the embedded standard set
${}^\sigma\mathbb{N} = \{{}^*n : n \in \mathbb{N}\}$ and usually do not bother to write *1, *2, etc., for familiar
numbers. The point we wish to make is that

${}^*\mathbb{N}_\infty = {}^*\mathbb{N} - {}^\sigma\mathbb{N} \ne \emptyset$,

there are infinite numbers in *N. In fact, every infinite set A from classical
analysis is properly extended:

${}^*A - {}^\sigma A \ne \emptyset$,

where *A is the extension of the set. The set

${}^\sigma A = \{{}^*a : a \in A\}$

is the (externally) embedded standard set. We explore this further in the
next example.


1.2. DEFINITION. The infinitesimals are given by:

$o = \{x \in {}^*\mathbb{R} : |x| < 1/n \text{ for each } n \in {}^\sigma\mathbb{N}\}$.

We write $x \approx y$ if $(x - y) \in o$ and say x is infinitesimally close to y. The
ring of finite numbers is given by:

$\mathcal{O} = \{x \in {}^*\mathbb{R} : |x| < n \text{ for some } n \in {}^\sigma\mathbb{N}\}$.

The infinite numbers are given by ${}^*\mathbb{R}_\infty = {}^*\mathbb{R} - \mathcal{O}$.
Besides "infinitesimally close" we may also say "infinitely near" or
simply "nearly". We say a hyperreal b is "near-standard" if there is an
$a \in {}^\sigma\mathbb{R}$ with $a \approx b$.


Since ${}^*\mathbb{N} - {}^\sigma\mathbb{N} \ne \emptyset$ there exist infinite integers $\Omega$, $\Omega + 1$, $3 \cdot \Omega$, $\Omega^2$, $2^{\Omega}$. All
these are different by Leibniz' Principle; for example, "for every $x \in \mathbb{R}$,
$x + 1 \ne x$" says $\Omega + 1 \ne \Omega$, and so on. (Observe therefore that neither
cardinal nor ordinal numbers in Cantor's sense can serve as *N.) The
reciprocals $1/\Omega$, $1/\Omega^2$ are infinitesimal.

One can summarize the properties of finite and infinitesimal numbers 
algebraically by: 


1.3. THEOREM. $\mathcal{O}$ is an ordered ring containing $o$ as a maximal order ideal
and $\mathcal{O}/o$ is order isomorphic to R, that is, each finite hyperreal is
infinitesimally near its standard part in ${}^\sigma\mathbb{R}$. We denote the canonical
homomorphism by "st",


$o \to \mathcal{O} \xrightarrow{\ \mathrm{st}\ } \mathbb{R}$,

or sometimes, $\hat{a} = \mathrm{st}(a)$.


We shall omit the proof, see ROBINSON [1966] or STROYAN and LUXEMBURG
[1976]. One important observation from this is the fact that a finite
number times an infinitesimal is infinitesimal. The theorem gives a useful
picture, which KEISLER [1976] calls the Infinitesimal Microscope (see Fig. 2).


Fig. 2. Infinitesimal Microscope: “a” is the only standard number in the field of view. 


Two points in *R are finitely far apart if $(x - y) \in \mathcal{O}$. For example, the
integers $\Omega$, $\Omega + 1$, $\Omega + 2$, and $\Omega + n$ for $n \in {}^\sigma\mathbb{N}$ are finitely far apart.
Notice that $\Omega/n$ for $n \in {}^\sigma\mathbb{N}$ is infinite and infinitely far from $\Omega$, as are $n \cdot \Omega$
and $\Omega^n$. The Infinite Telescope says that each galaxy or finite equivalence
class looks like a copy of $\mathcal{O}$. This follows from Leibniz' Principle applied to
"translation is an isometry" plus the last theorem to describe $\mathcal{O}$.

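Our final example of a *-transform is as follows. Completeness of R says
"every set B in the power set of R which is bounded above has a least
upper bound",


$I\{\forall B\,[(B \in \mathcal{P}(\mathbb{R}) \;\&\; \exists y\,[y \in \mathbb{R} \;\&\; \forall x\,[x \in B \Rightarrow x \le y]]) \Rightarrow$
$\exists b\,[b \in \mathbb{R} \;\&\; \forall z\,[z \in \mathbb{R} \Rightarrow ([\forall w\,[w \in B \Rightarrow w \le z]] \Leftrightarrow b \le z)]]]\}$.


The constants of the formal sentence are really $I^{-1}(\mathcal{P}(\mathbb{R}))$ and $I^{-1}(\mathbb{R})$,
$I^{-1}(<)$ and $I^{-1}(=)$. The *-transform says that every set $B \in {}^*[\mathcal{P}(\mathbb{R})]$ which
has an upper bound has a least upper bound. Not every subset of *R arises
in ${}^*\mathcal{P}(\mathbb{R})$; we have the following inclusions, all strict:


${}^\sigma\mathcal{P}(\mathbb{R}) \subset {}^*\mathcal{P}(\mathbb{R}) \subset \mathcal{P}({}^*\mathbb{R})$.


In other words, *-completeness is not the same as completeness in *R. The
set of infinitesimals has no least upper bound, since half of a positive
noninfinitesimal upper bound is still an upper bound and twice an
infinitesimal exceeds the original (using Leibniz' Principle on the sentence
"for every positive $x \in \mathbb{R}$, 2x > x"). Hence,

$o \notin {}^*\mathcal{P}(\mathbb{R})$,

of course,

$o \in \mathcal{P}({}^*\mathbb{R})$,


the infinitesimals are a subset of the hyperreals.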


1.4. DEFINITION. If A is an entity in $\mathcal{X}$, both A and its extension *A are
referred to as standard sets. The standard sets are elements of ${}^\sigma\mathcal{P}(X_k)$ for
some $k \in \mathbb{N}$.

Elements $B \in {}^*A$ of standard sets *A are said to be internal. For
example, internal subsets of *R are the elements of ${}^*\mathcal{P}(\mathbb{R})$. The set


$\{x \in {}^*\mathbb{R} : x \ge \Omega\} = [\Omega, \infty)$


is an internal set in ${}^*\mathcal{P}(\mathbb{R})$ and not in ${}^\sigma\mathcal{P}(\mathbb{R})$ when $\Omega$ is infinite. This is
internal by Leibniz' Principle applied to the statement "for every $a \in \mathbb{R}$,
$[a, \infty) \in \mathcal{P}(\mathbb{R})$". The internal sets are the ones which the formal language
"knows about"; any formal property referring only to internal sets itself
describes an internal set. The symbol "$\infty$" is not a hyperreal number.
The remaining sets in the superstructure based on *R are called external.
We proved that the infinitesimals are external, that is, $o \in
\mathcal{P}({}^*\mathbb{R}) - {}^*\mathcal{P}(\mathbb{R})$. Observe that existence of external sets in our model
requires us to embed ${}^*\mathcal{X}$ in set theory (or deal simultaneously with internal
sets in two set theories). This is the difference between an infinitesimal
analysis nonstandard model and the standard nonstandard models of set
theory (which need not be "well founded").

The space of continuous functions C[0, 1] extends to the *-continuous
functions *C[0, 1], that is, Leibniz' Principle says each $f \in {}^*C[0, 1]$ is a
hyperreal valued function on the set $\{x \in {}^*\mathbb{R} : 0 \le x \le 1\} = {}^*[0, 1]$. The
*-continuous functions satisfy the *-transform of epsilon-delta continuity.

The interaction between internal and external sets is what makes
Robinson's theory fruitful. As a final example, consider the internal
function $f(x) = \sin(\pi\Omega x)$ for infinite $\Omega \in {}^*\mathbb{N}_\infty$. We know f is *-continuous:
"for every $n \in {}^*\mathbb{N}$, $\sin(\pi n x) \in {}^*C[0, 1]$". Would you say it is continuous in
the standard sense? What is f(0)? $f(1/(2\Omega))$?
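
A rough numerical analogy may help with the questions just posed. The sketch below is not a nonstandard construction: it replaces the infinite hyperinteger by a large ordinary integer `N` (an assumption made purely for illustration) and compares the values of sin(πNx) at 0 and at 1/(2N), two points that are extremely close together.

```python
import math

# Large finite stand-in for an infinite hyperinteger (assumption, for illustration).
N = 10**9


def f(x):
    """sin(pi * N * x): continuous in x for every fixed N."""
    return math.sin(math.pi * N * x)


gap = 1 / (2 * N)          # distance between the two sample points
jump = f(gap) - f(0.0)     # change in f across that tiny gap

print(gap, jump)
```

The two arguments differ by only 1/(2N), yet the values differ by about 1. With a genuinely infinite index the gap is infinitesimal while the change in value is not, which is the phenomenon the questions in the text point toward.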

We will use this formulation of "ideal numbers" to sketch Infinitesimal
Calculus à la Leibniz-Robinson and develop Infinitesimal Geometry à la
Gauss afterward, but first we close the section by trying to place Robinson's
contribution in historical perspective. We strongly encourage our
reader to consult, in ROBINSON [1966], Robinson's own account of the
history of calculus. It is our belief that Robinson's solution of Leibniz'
problem stands as one of the major results of twentieth-century mathematics.
After nearly three centuries, Robinson put the original intuitive
formulation of calculus on as sound a basis as the rest of modern
mathematics. It is our belief that infinitesimal methods will prove useful in
modern pure and applied mathematics as well; there is strong evidence that
infinitesimals and *-finite sets can aid the art of invention by making
various complicated limits more clear and intuitive.

While Leibniz felt that the ideal numbers ought to have a consistent
treatment, he was both aware of contradictions and stated that infinitesimal
arguments could be replaced by arguments "in the style of Archimedes". A
precise formulation of the latter assertion is an open problem, see
ROBINSON [1973], problem eleven. As to the inconsistency, Newton's
fluxions also suffered. BERKELEY [1734] gave an excellent account of the
difficulties associated with throwing away "higher order terms" and
maintaining equality (see also NEWMAN [1956]). This led him to call
infinitesimals "ghosts of departed quantities". Berkeley showed that the
fluxions were no more consistent. With hindsight we cannot help but
wonder why the equivalence "is infinitesimally close to" was not distinguished
from "is equal to". Besides the infinitesimal relation "$\approx$" of (1.2)
above, we shall see that relative infinitesimals on a scale of another
infinitesimal play a fundamental role in Gaussian infinitesimal geometry.
These correspond to throwing away "second order terms" and keeping
"first order terms".

Leibniz' methods brought about tremendous development of calculus
and to some extent for a time the results overshadowed the underlying
inconsistencies. Nevertheless, Euler, for one, checked his results by
numerous methods because of his openly expressed scepticism. His great
care and insight is reflected in the fact that his proof of one of his most
fundamental lemmas, the product formula for sin(x), has been given a
translation into Robinson's theory of infinitesimals and stands as correct!
(See LUXEMBURG [1973] and STROYAN [1976].)

Both Lagrange and d'Alembert attempted to banish infinitesimals and
solve the foundational question around the end of the eighteenth century.
By this time Leibniz' assertion that infinitesimal arguments could be
rephrased into the style of Archimedes was not so obvious, but even after
Weierstrass introduced "$\varepsilon$-$\delta$" Riemann considered both methods correct,
though Weierstrass' the more "concrete". Besides intrinsic interest in the
foundational question, analysis was sophisticated by the late nineteenth
century, so sharp distinctions between convergence and uniform convergence
and between continuity and uniform continuity were needed,
especially for trigonometric series and the calculus of variations. It is
interesting to note that Riemann equated uniform continuity with Cauchy's


$x \approx y$ implies $f(x) \approx f(y)$


and that in 1853 Cauchy was forced to modify his infinitesimal definition of
convergence from 1821 to a form which excluded a counterexample of
Abel from 1826. Cauchy had proved that a series of continuous functions
has a continuous limit. ROBINSON [1966], p. 273, shows that the revised form
is equivalent to uniform convergence and indeed gives a correct result. We
shall have use for a related simple notion of "uniformly differentiable",
which is equivalent to "continuously differentiable" and which shows that
Gauss' definition of a surface with "continuous curvature" actually means
"$C^1$-embedded" in modern terminology. The "surprisingly" stronger properties
obtained by varying two things infinitesimally are one strength of
Robinson's method.

About the same time that Weierstrass established "$\varepsilon$-$\delta$", Cantor
created his theory of cardinal and ordinal numbers, again being motivated
to an extent by questions from trigonometric series. Cantor's infinite
numbers violate Leibniz' Principle so that the infinite numbers of Robinson
are a completely disjoint theory, though this fact certainly implies no
conflict. The great progress in measure theory and topology brought about
by Cantor's theory no doubt has contributed to modern disfavor for
infinitesimals, but this success does not imply that there is only one kind of
infinite number. We hope the mathematical community will recognize the
existence of (at least) two kinds of infinite numbers.

The author is not qualified to comment on the philosophical implications
of Robinson's contribution, but refers the reader at least to the closing
remarks in ROBINSON [1973].

In the years since Robinson's vindication of infinitesimals, his ideas have
been developed in several ways. KEISLER [1976] has given a simple
axiomatic development of infinitesimals suitable for beginning calculus
students. Keisler's approach to calculus largely avoids formal logic and
provides a clear consistent introduction to calculus with the strong intuitive
flavor infinitesimals have always had. His book also treats "epsilon-delta"
where it arises naturally and essentially, namely in numerical approximation.
Moreover, students' computational skills are enhanced by having infinitesimals.
JENSEN [1972] has given a computer-oriented version of infinitesimal
calculus where infinitesimals are infinitely accurate computations of an
ideal machine. The investigation of SULLIVAN [1974] uses Keisler's approach
for a broad spectrum of students and indicates that it is an exciting
and practical educational endeavor.

Many researchers, especially Robinson himself, have shown how
infinitesimals play a role in modern algebra and analysis. Two monographs,
ROBINSON [1966] and STROYAN [1976], treat a wide spectrum of analysis
from the point of view of infinitesimals. One important part of this research
is in seeing how topics invented since "epsilon-delta" can be treated
infinitesimally; this process frequently provides a new clear perspective.
For example, in 1969 Behrens cast recent work on bounded holomorphic
functions in terms of infinitesimals and subsequently made remarkable
progress on the study of analytic structure of maximal ideals and on the
Corona Problem. In BEHRENS [1974a], Behrens sketches the infinitesimal
approach to that work and makes interesting new conjectures. There are
other examples of this in the monographs cited above and in three
symposium publications, LUXEMBURG [1969], LUXEMBURG and ROBINSON
[1972], and HURD and LOEB [1974], as well as the notes of MACHOVER and
HIRSCHFELD [1969] and the mathematical literature (JOHNSON [1975]). We
are certain that experts in many fields would find the process of reformulating
open problems in terms of Robinson's infinitesimal foundations
rewarding as well as fun. Since infinitesimals frequently simplify complicated
"limiting procedures", sustained hard work may also show that a
formerly unsolvable problem lies within our grasp. We expect that process
to require insight and effort, but we also expect that in many cases it will
lead to significant progress if seriously attempted.


2. Elements of infinitesimal calculus 


We begin our development of calculus by showing that "$C^1$" can be
defined directly in a simple fashion. The "epsilon-delta" version of this
theorem is "well known" but inadequately appreciated, in our opinion. In
fact the property of uniform differentiability underlies a great deal of both
differential and integral calculus. We begin with a global one-dimensional
version to focus on the primary ingredient.


2.1. THEOREM. Let $f: R \to R$ be a standard real-valued function. The
following are equivalent:

(i) There is a standard map $Df: R \to R$ such that whenever $x$ is finite in
${}^*R$ and whenever $\delta z$ is a nonzero infinitesimal in ${}^*R$,

$$\frac{1}{\delta z}\,[f(x + \delta z) - f(x)] \approx Df_x.$$

(This is called uniform differentiability.)
(ii) For each standard $a \in {}^{\circ}R$, there is a finite number $A_a$ such that
whenever $x \approx y \approx a$ in ${}^*R$, $x \ne y$,

$$\frac{f(y) - f(x)}{y - x} \approx A_a.$$

(This is called local uniform differentiability at $a$.)
(iii) The function $f$ is continuously differentiable in the epsilon-delta sense.


Condition (i) will be most useful to us in infinitesimal geometry. It is also
the main ingredient in the proof of the Fundamental Theorem we give; that
proof is essentially the same as Cauchy's. The same idea underlies the
change of variables, divergence, de Rham's and other theorems that
require "$C^1$" assumptions. It is closely related to the local flow or
infinitesimal transformation of a "$C^1$" vector field as well (through the
"$C^1$-Picard" theorem).

The picture for condition (ii) under an infinitesimal microscope is shown
in Fig. 3.




Fig. 3. Uniform differentiability at a under infinitesimal microscope. 


Condition (ii) is a weaker requirement in the sense that we could ask that
it be satisfied at a single point $a \in {}^{\circ}R$. BEHRENS [1974b] and NIJENHUIS
[1974] show that this condition at a single point can replace the $C^1$
assumption of the inverse mapping theorem. Also, uniform partial
derivatives imply uniform total derivatives. Condition (ii) is perhaps the
easiest to put in epsilon-delta form; we leave that as an exercise.

The function $f(x) = x^2\sin(\pi/x)$ is not uniformly differentiable at zero.
When $y = 1/(\Omega + .5)$ and $x = 1/\Omega$ with $\Omega$ an infinite integer,
$\Delta y/\Delta x \approx \pm 2$, $Df_x = \pm\pi$, and $Df_0 = 0$, as the
reader can verify by calculations using Leibniz' Principle. This example
points out dramatically that it does not suffice to check the conditions for a
single value of $\delta z$ or only for $x = a$ (but this is a small price to pay for
consistent infinitesimals). The reader is also asked to draw $g(x) = |x|$
under an infinitesimal microscope at $x = 0$.

Notice that by direct calculation, $(f(\delta z) - f(0))/\delta z \approx 0$ for all
infinitesimal $\delta z$ in this example (see Fig. 4; zero is not shown in this
figure). Compare this with Theorem 2.2.
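
The three quantities in this example can be imitated numerically. The sketch below is only an illustration under stated assumptions: a large even integer `omega` stands in for the infinite integer, ordinary floating point replaces ${}^*R$, and the helper names `f`, `df` and `secant` are ours, not the text's.

```python
import math

def f(x):
    """x^2 sin(pi/x), extended by f(0) = 0."""
    return 0.0 if x == 0 else x * x * math.sin(math.pi / x)

def df(x):
    """Ordinary derivative of f away from zero."""
    return 2 * x * math.sin(math.pi / x) - math.pi * math.cos(math.pi / x)

def secant(x, y):
    """Difference quotient (f(y) - f(x)) / (y - x)."""
    return (f(y) - f(x)) / (y - x)

# omega is a finite stand-in for an infinite integer (even, so cos(pi*omega) = 1).
for omega in (10**3, 10**4, 10**5):
    x = 1.0 / omega
    y = 1.0 / (omega + 0.5)
    # secant between nearby points, derivative at x, secant anchored at 0
    print(omega, secant(x, y), df(x), secant(0.0, x))
```

For these even values of `omega` the first column of results stays near -2, the second near $-\pi$, and the third near 0, so three quantities that would all have to agree under uniform differentiability at zero remain apart.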


Fig. 4. Infinitesimal secants and tangents of $x^2\sin(\pi/x)$.

PROOF OF THEOREM 2.1. Let $x$ be given infinitesimally close to $a \in {}^{\circ}R$. First
we show that (i) implies $Df_x \approx Df_a$. Let $z$ be a finite number and
$\delta = |x - a|$ while $\delta z$ is "$\delta$ times $z$". Condition (i) at $x$ says:

$$f(x + \delta z) - f(x) = Df_x \cdot \delta z + \delta \cdot \eta$$

for some $\eta \approx 0$. We may also apply the condition at $a$ to obtain:

$$f(x + \delta z) - f(a) + f(a) - f(x) = Df_a\,(x + \delta z - a) - Df_a\,(x - a) + \delta \cdot \zeta = Df_a \cdot \delta z + \delta \cdot \zeta \quad \text{with } \zeta \approx 0.$$

Now we subtract these two equations and divide by $\delta z$ to see that
$Df_x \approx Df_a$.

Condition (ii) follows from this since

$$\frac{f(y) - f(x)}{y - x} \approx Df_x \approx Df_a$$

and $Df_a$ is necessarily a standard number since it is the value of a standard
function at a standard point. This shows that (i) implies (ii).

We divide the proof that (ii) implies (iii) into several steps. Theorems 2.2
and 2.3 are proved below.

2.2. THEOREM. Let $f$ be a standard real-valued function and $a \in {}^{\circ}R$. The
following are equivalent:

(i) (in ${}^*R$) There is a standard constant $b$ such that whenever $x \approx a$ in ${}^*R$,
then $(f(x) - f(a))/(x - a) \approx b$.

(ii) (in $R$) The function $f$ is defined on a neighborhood of $a$ and there exists
a number $b \in R$ satisfying the condition that for every $\varepsilon \in R^+$, there is a
$\delta \in R^+$ such that whenever $|x - a| < \delta$, then
$|(f(x) - f(a))/(x - a) - b| < \varepsilon$.

Theorem 2.2 shows that $f$ is epsilon-delta differentiable using condition
(ii) with one number $x$ or $y$ set equal to $a$ and the constant $b = st(A_a)$.
Moreover, we may use this to (externally) define a standard map $Df_a =
st(A_a)$. (The definition is external in the sense that it only gives values of $Df$
for standard values of $a$.)

We may in turn apply Leibniz' Principle to condition (ii) of 2.2 and the
extension of the function $x \mapsto Df_x$ and a fixed standard $\varepsilon > 0$ to see that
"there exists a $\delta \in {}^*R^+$ such that whenever $|x - y| < \delta$, then
$|(f(y) - f(x))/(y - x) - Df_x| < \varepsilon$". If we select $x \approx a \in {}^{\circ}R$ and $\varepsilon \approx 0$ and
also apply condition (ii) we may conclude that $Df_x \approx Df_a$. (Notice that this
proves (ii) implies (i).)

2.3. THEOREM. Let $g$ be an internal real-valued function and $a \in {}^{\circ}R$. The
following are equivalent:

(i) $g(x) \approx g(a)$ whenever $x \approx a$.

(ii) The function $g$ is defined on a standard neighborhood of $a$ and for
every $\varepsilon \in {}^{\circ}R^+$ there is a $\delta \in {}^{\circ}R^+$ such that whenever $|x - a| < \delta$, then

$$|g(x) - g(a)| < \varepsilon.$$

Robinson calls these equivalent conditions S-continuity at $a$.

Theorem 2.3 shows that $Df$ is epsilon-delta continuous at each standard
real number, letting $g = {}^*Df = Df$, by the convention not to put *'s on
functions, and applying Leibniz' Principle to pull the condition back to the
standard model, that is, if $\varepsilon \in R^+$, then

$$\{\delta \in R^+ : \text{whenever } |x - a| < \delta, \text{ then } |Df_x - Df_a| < \varepsilon\}$$

is nonempty.

Modulo Theorems 2.2 and 2.3 we have shown (i) $\Leftrightarrow$ (ii) $\Rightarrow$ (iii).
Condition (iii) implies (i) since by continuity and Theorem 2.3, $Df_y \approx Df_a$
whenever $y \approx a$. If (i) fails, then

$$\frac{1}{\delta z}\,[f(x + \delta z) - f(x)] = Df_x + \theta \quad \text{where } \theta \not\approx 0.$$

By the *-transform of the mean value theorem there is a $y$ between $x$ and
$x + \delta z$ where $Df_y = Df_x + \theta$ and either $Df_y \not\approx Df_a$ or $Df_x \not\approx Df_a$.

The proofs of Theorems 2.2 and 2.3 share one common feature: they
push an infinitesimal condition out to a finite amount. A general
formulation of the continuity principle (which we facetiously refer to as
Cauchy's) is as follows. We enlarge $L(\in)$ by adding the full "diagram" of
the nonstandard model, that is, a formal constant for each internal set. We
extend $\Gamma$ to include these added constants.


2.4. CAUCHY'S PRINCIPLE. If $P(x)$ is a bounded internal formal property of the
free variable $x$ and if $\Gamma\{P(\eta)\}$ holds for every infinitesimal $\eta$, then $\Gamma\{P(x)\}$
holds for all $x$ smaller than some standard $\delta$, $|x| < \delta$.


This was even generalized to topological spaces by Luxemburg, see
STROYAN [1976].


PROOF. As we showed in the first section, the infinitesimals $o$ are an
external set. The set

$$\{\delta \in {}^*R : \forall x\,[\,|x| < \delta \Rightarrow \Gamma\{P(x)\}\,]\}$$

is internal and by hypothesis contains all the infinitesimals, so it contains a
noninfinitesimal $\delta$. □


PROOF OF THEOREM 2.2. (i) implies (ii): Let $\varepsilon \in {}^{\circ}R^+$ be given. The following
set is standard since its description involves only standard constants:

$$\Big\{\delta \in {}^*R : \text{whenever } |x - a| < \delta,\ f(x) \text{ is defined and } \Big|\frac{f(x) - f(a)}{x - a} - b\Big| < \varepsilon\Big\}.$$

Condition (ii) follows from Cauchy's Principle or alternately from the fact
that the above set is nonempty in ${}^*R$, therefore, by Leibniz' Principle, is
nonempty in $R$ as well.

(ii) implies (i): Let $x \approx a$ in ${}^*R$. We apply Leibniz' Principle to (ii) for a
fixed standard positive $\varepsilon$ and standard $\delta$ from (ii). Since $|x - a| < \delta$ when $\delta$
is standard we know that the infinitesimal difference quotient is within
epsilon of $b$, but epsilon is arbitrary, so (i) holds. □


An interpretation of Leibniz'

$$\frac{df}{dx}(a)$$

can now be made as the common standard part of all the infinitesimal
difference quotients

$$\frac{f(a + \delta x) - f(a)}{\delta x}$$

provided they all have a common standard part. (We go beyond this below
and interpret both $df$ and $dx$.)


EXERCISE. Compute $dP/dx$ for polynomials using the *-binomial theorem
on $(a + \delta x)^n$.
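
As a hedged companion to the exercise, the sketch below expands $((a + d)^n - a^n)/d$ with the ordinary binomial theorem in exact rational arithmetic; a small rational `d` stands in for the infinitesimal $\delta x$, and the function names are ours. The constant term is $n a^{n-1}$ and every other term carries a factor of `d`, which is the reason the standard part of the quotient is $n a^{n-1}$.

```python
from fractions import Fraction
from math import comb

def difference_quotient_terms(a, n):
    """Coefficients c_j with ((a + d)^n - a^n) / d = sum_j c_j * d^j,
    read off from the binomial theorem: c_j = C(n, j+1) * a^(n-j-1)."""
    return [comb(n, j + 1) * Fraction(a) ** (n - j - 1) for j in range(n)]

def evaluate(coeffs, d):
    """Evaluate the polynomial sum_j coeffs[j] * d^j exactly."""
    return sum(c * Fraction(d) ** j for j, c in enumerate(coeffs))

coeffs = difference_quotient_terms(2, 3)   # a = 2, P(x) = x^3
d = Fraction(1, 1000)                      # finite stand-in for delta x
print(coeffs[0], evaluate(coeffs, d) - coeffs[0])
```

The second printed value is the part of the quotient that vanishes with `d`; for a general polynomial one would apply the same expansion term by term.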


PROOF OF THEOREM 2.3. (i) implies (ii): Since $g$ is internal, the property
$P_\varepsilon(\eta)$ = "whenever $|x - a| < \eta$, then $g(x)$ is defined and
$|g(x) - g(a)| < \varepsilon$" is also internal. Provided $\varepsilon$ is standard and positive,
Cauchy's Principle applies and gives (ii).

(ii) implies (i): Let $x \approx a$, so $|x - a| < \delta$ whenever $\delta$ is standard and
positive. Condition (ii) implies then that $|g(x) - g(a)| < \varepsilon$ for any standard
positive epsilon, thus $g(x) \approx g(a)$. □

We reiterate the question of the last section: Is the internal function
f(x) = sin(w#x) continuous in the standard sense?


2.5. DEFINITION. Let $f: [a,b] \to R$ be a standard function. To define
Riemann integrals over $[a,b]$ we partition the interval into a *-finite
number of infinitesimal subintervals $a = x_0 < x_1 < \cdots < x_\Omega = b$ with
$\Delta x_k = x_k - x_{k-1} \approx 0$ for $1 \le k \le \Omega$, internally choose $y_k$,
$x_{k-1} \le y_k \le x_k$, and compute the *-finite sum:

$$\sum_{k=1}^{\Omega} f(y_k)\,\Delta x_k.$$

If all these infinitesimal Riemann sums give nearly the same finite answer
we call the common standard part the integral,

$$\int_a^b f(x)\,dx.$$


We elaborate on the definition since it is a little complicated. First, the
term *-finite means the *-transform of "finite" from the standard model.
"Finite" in the standard model can be written "there is an $n \in N$ and a
bijection from $\{k \in N : 1 \le k \le n\}$ onto $P$", hence a *-finite partition is
simply an internal bijection from a set $\{k \in {}^*N : 1 \le k \le \Omega\}$ onto a subset
of $[a,b]$. An example is given simply by the equal partition

$$x_k = \frac{k}{\Omega}(b - a) + a, \qquad 0 \le k \le \Omega.$$

The words "internally choose $y_k$" mean $y_{(\,\cdot\,)}$ is also an internal function
from $1 \le k \le \Omega$ into $[a,b]$. Since the function $\sum$ of the standard model is
defined for all finite sets of reals, its nonstandard extension is defined for all
*-finite sets of hyperreals, thus the *-finite sum exists in ${}^*R$. While the
*-finite sum always exists, two *-finite infinitesimal Riemann sums need not
be infinitesimally close. For example, let $f$ be the indicator function of the
rational numbers, $f(x) = 1$ if $x$ is rational, $f(x) = 0$ if $x$ is irrational;
*-irrational $y_k$'s give zero sum while *-rational $y_k$'s give $(b - a)$ sum, so the
integral does not exist.
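
The equal partition and the freedom in choosing the evaluation points $y_k$ can be mimicked with a large finite number of subintervals. The sketch below is only a finite caricature: `n` plays the role of the infinite integer, and `riemann_sum` with its `tag` parameter is a name we introduce for illustration.

```python
def riemann_sum(f, a, b, n, tag):
    """Sum of f(y_k) * dx over the equal partition x_k = a + k (b - a) / n.
    tag in [0, 1] picks y_k = x_{k-1} + tag * dx inside each subinterval."""
    dx = (b - a) / n
    return sum(f(a + (k - 1 + tag) * dx) for k in range(1, n + 1)) * dx

def square(x):
    return x * x

n = 10**5                                    # finite stand-in for the infinite integer
left = riemann_sum(square, 0.0, 1.0, n, 0.0)   # y_k = x_{k-1}
right = riemann_sum(square, 0.0, 1.0, n, 1.0)  # y_k = x_k
print(left, right, right - left)
```

For this continuous integrand the two choices of evaluation points differ by about `1/n`, the finite analogue of two infinitesimal Riemann sums being infinitesimally close.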

It is fairly easy to see that an internal finite S-continuous function will
actually yield a common standard part for all its infinitesimal Riemann
sums. For example, if $z_k$ and $y_k$ are two choices of evaluation points:

$$\Big|\sum f(z_k)\,\Delta x_k - \sum f(y_k)\,\Delta x_k\Big| \le \max_k |f(z_k) - f(y_k)| \cdot (b - a)$$

and the internal *-finite maximum is attained with $z_k \approx y_k$ so that
$f(z_k) \approx f(y_k)$. Similarly, different partitions have a common refinement
by Leibniz' Principle applied to the statement "finite partitions have a
common refinement". The sums are close on the common refinement and each
refined sum nearly agrees with the old one. In particular this shows that
continuous standard functions are Riemann integrable and for such
functions any infinitesimal Riemann sum can be used to evaluate
$\int_a^b f(x)\,dx = st\big(\sum_{k=1}^{\Omega} f(y_k)\,\Delta x_k\big)$. (Well, it does not show the
"epsilon-delta" version of Riemann integrable, but our definition is
equivalent and perhaps Riemann would not have objected to infinitesimals.)


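2.6. THE FUNDAMENTAL THEOREM OF CALCULUS. (i) Let $f: [a,b] \to R$ be a
continuous standard function and let $F(x) = \int_a^x f(t)\,dt$, then
$(dF/dx)(x) = f(x)$, $a \le x \le b$.

(ii) Let $F(x)$ be a uniformly differentiable function in a neighborhood of
$[a,b]$, then $\int_a^b (dF/dx)(x)\,dx = F(b) - F(a)$.


PROOF. By Leibniz' Principle,

$$\delta z \cdot \min_{x \le t \le x + \delta z} f(t) \;\le\; \int_x^{x + \delta z} f(t)\,dt \;\le\; \delta z \cdot \max_{x \le t \le x + \delta z} f(t)$$

and when $\delta z \approx 0$, $\min f(t) \approx f(x) \approx \max f(t)$, so

$$\frac{dF}{dx}(x) = st\Big(\frac{F(x + \delta z) - F(x)}{\delta z}\Big) = f(x)$$

no matter which infinitesimal we choose. This proves (i).

($\approx$ Cauchy): Let $\{x_k : 0 \le k \le \Omega\}$ be an infinitesimal partition. By
condition (i) of 2.2,

$$\frac{dF}{dx}(x_k) = \frac{F(x_k) - F(x_{k-1})}{x_k - x_{k-1}} + \eta_k$$

with $\eta_k \approx 0$ for $1 \le k \le \Omega$. Hence

$$\sum_{k=1}^{\Omega} \frac{dF}{dx}(x_k)\,\Delta x_k = \sum_{k=1}^{\Omega} [F(x_k) - F(x_{k-1})] + \sum_{k=1}^{\Omega} \eta_k\,\Delta x_k$$

and

$$\Big|\sum_{k=1}^{\Omega} \eta_k\,\Delta x_k\Big| \le \max_k(|\eta_k|) \cdot \sum_{k=1}^{\Omega} \Delta x_k = |\eta_m| \cdot (b - a) \approx 0$$

while

$$\sum_{k=1}^{\Omega} (F(x_k) - F(x_{k-1})) = F(b) - F(a)$$

by transfer of the formula for telescoping sums. This proves (ii). □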
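
The bookkeeping in part (ii) can be checked on a finite partition. The following sketch assumes $F = \sin$ on $[0, 2]$ and a large finite `n` in place of the infinite integer; the function name and return values are ours. It returns the Riemann sum of the derivative, the difference $F(b) - F(a)$, and the largest error term $|\eta_k|$.

```python
import math

def telescoping_check(F, dF, a, b, n):
    """Compare sum dF(x_k) dx with F(b) - F(a) on the equal partition and
    report the largest error eta_k = dF(x_k) - (F(x_k) - F(x_{k-1})) / dx."""
    dx = (b - a) / n
    xs = [a + k * dx for k in range(n + 1)]
    riemann = sum(dF(xs[k]) * dx for k in range(1, n + 1))
    errors = [dF(xs[k]) - (F(xs[k]) - F(xs[k - 1])) / dx for k in range(1, n + 1)]
    return riemann, F(b) - F(a), max(abs(e) for e in errors)

riemann, exact, max_eta = telescoping_check(math.sin, math.cos, 0.0, 2.0, 10**4)
print(riemann, exact, max_eta)
```

The gap between the first two numbers is bounded by `max_eta * (b - a)`, which is the finite counterpart of the estimate $|\eta_m| \cdot (b - a) \approx 0$.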
We need to cast uniform differentiability in a "several variable" setting
(a Banach manifold is really no harder). Let $M \subseteq R^n$ be a subset of n-space
for some fixed finite $n$. The metric on $R^n$ has an extension to ${}^*R^n$,

$$|X - Y| = \Big(\sum_{j=1}^{n} (x^j - y^j)^2\Big)^{1/2}$$

and we say that the vector $X = (x^1, \ldots, x^n)$ is infinitesimally close to the
vector $Y = (y^1, \ldots, y^n)$ when

$$|X - Y| \approx 0 \quad \text{in } {}^*R.$$

This is easily seen to be equivalent to $x^j \approx y^j$, $1 \le j \le n$. We also might say
$X$ is "infinitely near" $Y$ or "nearly equal to" $Y$. A vector $B$ is
"near-standard" if there is a standard vector $A$ such that $B \approx A$. The vector $X$ is
finite when $|X|$ is finite in ${}^*R$. The standard part of a finite vector $X$ is the
standard vector $st(X) = (st(x^1), st(x^2), \ldots, st(x^n))$.

We include a few results to indicate how topology of subsets of $R^n$ can be
treated with infinitesimals. Instead of dealing with the monad of a standard
point, that is, all $X \in {}^*R^n$ with $X \approx A$, we deal with the relative monad, all
$X \in {}^*M$ with $X \approx A$.


2.7. THEOREM. A standard subset $U \subseteq M$ is a topological M-neighborhood
of $A \in U$ if and only if whenever $X \approx A$, $X \in {}^*M$, then $X \in {}^*U$, that is,
provided ${}^*U$ contains the relative ${}^*M$-monad of $A$.


PROOF. Apply Cauchy's Principle to the statement

"${}^*U$ contains the internal set $B(\eta) \cap {}^*M$"

where $B(\eta) = \{X \in {}^*R^n : |X - A| < \eta\}$. Then apply Leibniz' Principle to
the standard $\delta$ obtained from Cauchy's Principle, so

"$U$ contains $B(\delta) \cap M$", without *'s.

The converse is immediate from Leibniz' Principle. □


2.8. THEOREM. The standard set $U \subseteq M$ is M-closed if and only if whenever
$u \in {}^*U$ is infinitesimally close to a standard $m \in {}^{\circ}M$, then $m \in {}^{\circ}U$. More
specifically, a sequence $\{X_n\}$ converges to $A$ if and only if $X_\Omega \approx A$ for infinite
$\Omega \in {}^*N$.


For a proof of Theorem 2.8 or 2.9 see ROBINSON [1966] or STROYAN
[1976].

2.9. THEOREM. A standard set $U \subseteq M$ is M-compact if and only if every
point $X \in {}^*U$ is near a standard $u \in {}^{\circ}U$.


For example, the open unit interval $(0,1)$ is not compact because
${}^*(0,1) = \{x \in {}^*R : 0 < x < 1\}$ contains infinitesimals $\varepsilon \approx 0$, but does not
contain 0 itself. The natural numbers are not compact because ${}^*N$ contains
an infinite integer $\Omega$, not near any standard point.

We denote the space of linear maps from $R^m$ to $R^n$ by $Lin(R^m, R^n)$. An
internal linear map is an element of ${}^*Lin(R^m, R^n)$. A finite internal linear
map is one which has finite sup norm or equivalently finite scalars in the
matrices representing it with respect to standard bases or equivalently one
which maps finite vectors to finite vectors.


2.10. THEOREM. Let $f: U \to R^n$ be a standard function defined on $U \subseteq R^m$.
The following are equivalent:

(i) There is a standard map $Df: U \to Lin(R^m, R^n)$ such that whenever $x$ is
near a standard point of ${}^{\circ}U$, and whenever $\delta z$ is infinitesimal in ${}^*R^m$,

$$f(x + \delta z) - f(x) = Df_x(\delta z) + |\delta z| \cdot \eta$$

for some infinitesimal $\eta \in {}^*R^n$.
(ii) For each standard $a \in {}^{\circ}U$, there is a finite internal linear map $L_a$ such
that whenever $x \approx y \approx a$,

$$f(y) - f(x) = L_a(y - x) + |y - x| \cdot \eta$$

for some infinitesimal $\eta \in {}^*R^n$.

(iii) $U$ is open and $f$ is continuously differentiable on $U$.


PROOF. Modify the proof of Theorem 2.1. □


Higher order derivatives are given infinitesimally by the following
uniform conditions.


2.11. TAYLOR'S SMALL OH FORMULA. Let $f: U \to R^n$ be a standard map on
$U \subseteq R^m$. The following are equivalent:

(i) There exist standard maps $L^h: U \to Lin^h(R^m; R^n)$, the symmetric
h-linear maps from $R^m$ to $R^n$, such that whenever $x$ is near a standard point in
${}^*U$ and $\delta z$ is infinitesimal in ${}^*R^m$,

$$f(x + \delta z) = \sum_{h=0}^{k} \frac{1}{h!}\,L^h_x(\delta z)^{(h)} + |\delta z|^k \cdot \eta$$

for some $\eta$ infinitesimal in ${}^*R^n$.
(ii) $U$ is open and $f \in C^k(U; R^n)$.
The maps $L^h$ are h-th-order derivatives of $f$, $D^h f$.


PROOF. See STROYAN [1976], §5.7.9.


3. Continuous curvature and differentials 


Article 3 in Gauss [1827], the magnificent Disquisitiones generales circa 
superficies curvas, begins as follows. 

‘‘A curved surface is said to possess continuous curvature at one of its 
points A, if the directions of all straight lines drawn from A to points of the 
surface at an infinitely small distance from A are deflected infinitely little 
from one and the same plane passing through A. This plane is said to touch 
the surface at the point A”’. 

This motivates the following definition. 


3.1. DEFINITION. A subset $M \subseteq R^n$ is an m-manifold with continuous
curvature provided that there is a standard map $T$ from the points of $M$ into the
affine m-dimensional planes of $R^n$ satisfying:

(i) If $A \in M$, then $T_A$ contains $A$.

(ii) For each near-standard $A \in {}^*M$, the orthogonal projection from ${}^*M$
to $T_A$ maps the set of $B \in {}^*M$ with $B \approx A$ onto the set of $b \in T_A$ satisfying
$b \approx A$.

(iii) For each near-standard $A \in {}^*M$ and $B \in {}^*M$ with $B \approx A$, if
$B - A = t + n$ where $t$ lies in $T_A$, and $n$ is normal to $T_A$, then
$|n| / |B - A| \approx 0$.


Condition (i) says a preferred m-plane meets $M$ at $A$. Condition (ii) says
$M$ infinitesimally has dimension at least $m$, and (iii) says the deflection angle
from the tangential component is infinitesimal. It is important that we treat
all the near-standard points of ${}^*M$ alike, otherwise we only get
"differentiable" and not "$C^1$". Notice that (iii) forbids improperly embedded
manifolds as well as kinks (see Fig. 5).
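
The ratio in condition (iii) is easy to tabulate for plane curves. The sketch below is a finite illustration only: a small real step `h` stands in for an infinitesimal displacement, the candidate tangent line at the origin is the x-axis, and `normal_ratio` is a helper name of ours. It contrasts the parabola $y = x^2$ with the kink $y = |x|$.

```python
import math

def normal_ratio(curve, tangent, t0, h):
    """|n| / |B - A| for A = curve(t0), B = curve(t0 + h), where n is the
    component of B - A normal to the unit vector `tangent`."""
    ax, ay = curve(t0)
    bx, by = curve(t0 + h)
    dx, dy = bx - ax, by - ay
    along = dx * tangent[0] + dy * tangent[1]
    nx, ny = dx - along * tangent[0], dy - along * tangent[1]
    return math.hypot(nx, ny) / math.hypot(dx, dy)

def parabola(t):
    return (t, t * t)

def kink(t):
    return (t, abs(t))

for h in (1e-2, 1e-4, 1e-6):
    print(h, normal_ratio(parabola, (1.0, 0.0), 0.0, h),
          normal_ratio(kink, (1.0, 0.0), 0.0, h))
```

The parabola's ratio shrinks with `h`, while the kink's ratio stays at $1/\sqrt{2}$ however small `h` is; no other line through the origin does better for the kink, since the two branches leave in different directions.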


3.2. THEOREM. If $M$ is an m-manifold in $R^n$ with continuous curvature in
the sense of Gauss, then $M$ is a $C^1$-m-manifold in the sense of Weyl with
charts given by local projection onto the tangent planes.

Fig. 5.


A "$C^1$-manifold in the sense of Weyl" means the conventional abstract
manifold definition: $M$ is covered by an atlas of charts where each chart is a
homeomorphism $\varphi$ from a subset of $M$ onto an m-dimensional ball. The
charts satisfy the overlap condition that when $\psi \circ \varphi^{-1}$ is defined for two
charts, it is a $C^1$ map from $R^m$ to $R^m$.


PROOF. First we show that when $A$ and $B$ are near-standard on ${}^*M$, with
$A \approx B$, then $T_A$ is nearly parallel to $T_B$ (or equivalently since $A \approx B$, the
tangent planes have the same standard part, that is, there is a standard
plane $P$ so $st(T_A) = st(T_B) = P$). Let $\delta = |B - A| \ne 0$. The unit vectors,
$st((B - A)/\delta) = -st((A - B)/\delta)$ lie in the intersection $st(T_A) \cap st(T_B)$. By
(iii) we have

$$\frac{B - A}{\delta} = \frac{n_A}{\delta} + \frac{t_A}{\delta} \approx \frac{t_A}{\delta},$$

normal + tangent in $T_A$, and

$$\frac{A - B}{\delta} = \frac{n_B}{\delta} + \frac{t_B}{\delta} \approx \frac{t_B}{\delta},$$

normal + tangent in $T_B$, hence, $t_A/\delta \cdot t_B/\delta \approx -1$ and the projection of one
on the other has the same standard part. It simplifies our notation to write
$B - A \approx^{\delta} t_A$ for $(B - A)/\delta \approx t_A/\delta$, etc. By the onto assumption (ii) there
are points $C, D$ in ${}^*M$ such that $\delta = |C - A| = |D - B|$ and the angle
between $C - A$ and $D - B$ is the angle of $T_A$ with $T_B$. Also, by (iii)
$C - A \approx^{\delta} t_C$ in $T_A$ and $D - B \approx^{\delta} t_D$ in $T_B$. Now,
$C - A + A - B \approx^{\delta} t_C + t_B \approx^{\delta} t_C - t_A \approx^{\delta} C - B$ so that $T_A$ is nearly
parallel to $T_B$, since the angle-forming vectors lie $\delta$-nearly in both $T_A$ and $T_B$:

$$0 \approx^{\delta} (C - A) - (D - B).$$


Second, the orthogonal projection from ${}^*M$ onto $T_A$ is one-to-one on the
infinitesimal neighborhood of a near-standard vector on ${}^*M$. Suppose
$B - A = T + M$ and $C - A = T + N$ where $M$ and $N$ are normal to $T_A$.
$C - B = C - A + A - B = N - M$, yet lies nearly in $T_B$ by (iii) and nearly
in $T_A$ since $T_A$ is nearly parallel to $T_B$. Hence $M = N$.

Consider the property $P(\eta)$: "orthogonal projection from the set of $B$
on ${}^*M$ such that $|B - A| < \eta$ is a bijection onto its image in $T_A$". By the
above remarks, $P(\eta)$ holds for every positive infinitesimal. Cauchy's
Principle shows that projection is a local bijection.

Take two near-standard points $A, B$ on ${}^*M$ with projections $P_A, P_B$
which are bijective on neighborhoods overlapping at the near-standard
point $C$. The maps $P_A$ and $P_B$ defined by neglecting the normal component
of the displacement vectors are definable everywhere in ${}^*R^n$ and in
particular, on $T_C$. We consider the "reflection" of $T_A$ on $T_B$ through $T_C$ as
shown schematically in Fig. 6, $\psi = P_B \circ (P_A^{-1} | T_C)$.
This is a linear map in the displacement from $P_A(C)$ to the displacement
from $P_B(C)$. The actual overlap map is $\varphi = P_B \circ P_A^{-1}$, that is, project down
to $M$ from $T_A$ and back to $T_B$, as shown schematically in Fig. 7.

Without essential restriction we may assume that $T_A$, $T_B$, and $T_C$ all meet
at acute angles.


Fig. 7.


We take $c = P_A(C)$ and let $x \approx c$ on $T_A$. Also, let $X = P_A^{-1}(x)$. $P_A^{-1}$ is
continuous so that $X \approx C$ and we may apply Gauss' condition to see that
$\varphi(x)$ and $\psi(x)$ differ by an infinitesimal multiple of $|X - C|$. This only
requires a little trigonometry comparing the difference between reflection
from $T_C$ and $M$ which we leave to the reader. This is a proof of the uniform
differentiability of $\varphi$ at $C$.

Applying Theorem 2.10 we see that the overlap maps are $C^1$. □


Let $A$ be near-standard on ${}^*M$ and $0 < \delta \approx 0$. We shall write

$E \approx^{\delta} F$ provided $|E - F| / \delta$ is infinitesimal

and say $E$ is "$\delta$-nearly equal to" $F$ or $E$ differs "$\delta$-infinitesimally" from
$F$. Gauss' condition means that the set

$$\delta M_A = \{B \in {}^*M : |B - A| / \delta \text{ is finite}\}$$

is $\approx^{\delta}$-nearly an $\mathcal{O}$-vector space with $A$ as the zero. More specifically, linear
combinations in ${}^*R^n$ of displacement vectors with finite scalars project
back into $\delta M_A$ with at most an error which is infinitesimal when divided by
$\delta$. Another way to view this is that $\delta M_A$ only differs $\delta$-infinitesimally from
its projection on $T_A$ and that is a vector space. This provides the link
between infinitesimals on a fixed scale and differential forms. If $\zeta$ is a
standard $C^1$-covector valued map on $M$, then $\delta \cdot \zeta_A + A$ can be thought of
as nearly in $\delta M_A$, inverting the projection from $T_A$ (and using the ambient
inner product for a vector-covector isomorphism). On the other hand, if
$x: {}^*M \to {}^*R^n$ is an internal finite map such that $\delta \cdot x(A) \in \delta M_A$ and $x$ has a
finite linear approximation in the sense of 2.10, we may interpret

$$dx$$

as the $\approx^{\delta}$-equivalence class of the map. The usage of infinitesimals in local
geometry usually does not require a function, but rather only a single
$\approx^{\delta}$-class of perturbations. Picard's existence theorem for differential
equations would let us localize such an infinitesimal condition. We regret
lacking the space to do this in greater detail; the reader can decide which
way Gauss intended.

The exterior algebra of the space $\delta M_A$ can now be constructed by
identifying near-parallelograms of $A, B, C$ and $A, D, E$ where $|B - A|$, $|C - A|$,
$|D - A|$ and $|E - A|$ are all $\delta$-finite provided $\{B - A, C - A\}$ and
$\{D - A, E - A\}$ span the same subspace of $T_A$ and have nearly the same
area after division by $\delta^2$ (and so on, up to m-forms). We call such a pair a
$\delta$-area element when the area divided by $\delta^2$ is noninfinitesimal.

We may interpret the $\delta^2$-equivalence class of an internal assignment of
such $\delta$-area elements,

ay 


as a 2-form (and so on, up to m-forms. A complete local account of forms is
given in STROYAN [1976], Ch. 5).


4. Kissing curves 


We shall begin our study of curves in $R^3$ with a construction of the
osculating circle, the circle which exactly kisses a curve at a point on the
curve. We rely on the Gaussian Definition 3.1 of a curve $\Gamma$ where $m = 1$.
We may parametrize the curve locally with the length $u$ along a tangent
line (3.2) and unit tangents are approximated by

$$T_A \approx \frac{B - A}{|B - A|}$$

whenever $B \approx A$ on $\Gamma$. ROBINSON [1966] gives an infinitesimal account of
arc length when $\Gamma$ is given by a regular smooth parametric function $X(t)$.
In that case,

$$\frac{dX}{ds}(A) = \frac{\dfrac{dX}{dt}(A)}{\dfrac{ds}{dt}(A)} = \frac{st\Big(\dfrac{X(t + \delta t) - X(t)}{\delta t}\Big)}{st\Big(\dfrac{|X(t + \delta t) - X(t)|}{\delta t}\Big)}$$

when $A$ is standard, $A = X(t)$, $B = X(t + \delta t) \approx A$. With length along the
tangent line as parameter $|X(u) - A| = |u|(1 + \varepsilon)$, by (3.1), so

$$\frac{ds}{du}(A) = 1 \quad \text{and} \quad \frac{dX}{du}(A) = \frac{dX}{ds}(A) = T_A.$$

Now consider the normal displacement $n$ of a point infinitesimally far
from $A$ in the case where $X$ is twice continuously differentiable,
$n = [X(\delta) - A - \delta T_A]$. To begin with, by Taylor's formula
$n = \tfrac{1}{2}\delta^2 (d^2X/du^2)(A) + \delta^2 \cdot \eta$ with $\eta \approx 0$, and
$(d^2X/du^2)(A) = (d/du)(dX/du)(A) = dT/ds(A)$. We let $\kappa_A = dT/ds(A)$ and
assume $\kappa_A \ne 0$, so $T_A$ and $n$ determine a plane $P$ through $A$. We let $D$ be
the intersection of the line parallel to $n$ through $A$ and the normal bisector
of $AB$ in the plane $P$, $B = X(\delta)$.


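Fig. 8.


The triangles of Fig. 8 are similar so that

$$\frac{|D - A|}{|B - A|} = \frac{|B - A|}{2|n|}$$

and $|B - A| = \delta(1 + \varepsilon)$, so

$$|D - A| = \frac{\delta^2 (1 + \varepsilon)^2}{2\,\big|\tfrac{1}{2}\delta^2 \kappa_A + \delta^2 \eta\big|} \approx \frac{1}{|\kappa_A|} \quad \text{and} \quad \frac{n}{|n|} \approx \frac{\kappa_A}{|\kappa_A|}.$$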

Moreover, the circle through $A$ and $B$ with center $D$ has a circular standard
part, the standard circle $C_A$ tangent to the curve with radius $1/|\kappa_A|$, pointing
in the direction of $\kappa_A$. This clearly does not depend on the choice of $\delta$, in
fact, $C_A$ is the standard part of the infinitesimal tube

$$\{Y : |Y - D| \cdot |\kappa_A| \approx 1\}.$$

The circle through any three points $F, B, E$ near $A$ on the curve lies inside this
tube as we now demonstrate. Suppose $B$ lies between $F$ and $E$ and
$|E - B| > |F - B|$. By Taylor's formula,

$$E - B = \varepsilon T_B + \frac{\varepsilon^2}{2}\kappa_B + \varepsilon^2 \eta, \qquad \frac{\varepsilon}{|E - B|} \approx 1, \quad \eta \approx 0,$$

and

$$B - F = \varphi T_B - \frac{\varphi^2}{2}\kappa_B + \varphi^2 \eta_1, \qquad \frac{\varphi}{|F - B|} \approx 1, \quad \eta_1 \approx 0,$$

so

$$\frac{1}{\varepsilon}(E - B) - \frac{1}{\varphi}(B - F) = \frac{\varepsilon + \varphi}{2}\,\kappa_B + (\varepsilon + \varphi) \cdot o.$$

This proves that the plane of $FBE$ is nearly the osculating plane. By (3.1)
the angle between $B - F$ and $E - B$ is infinitesimal, thus the normal bisection
plane of $EB$ is nearly perpendicular to $T_B \approx T_A$ and it suffices to show that
the radius $r$ of the circle through $F$, $B$, and $E$ is nearly $1/|\kappa_A|$. By simple
trigonometry and the fact that the angle between $B - F$ and $E - B$ is
infinitesimal we see that

$$\frac{1}{r} \approx \frac{2}{|E - F|}\,\Big|\frac{1}{\varepsilon}(E - B) + \frac{1}{\varphi}(F - B)\Big|$$

and using the Taylor expansions above, $r \approx 1/|\kappa_B| \approx 1/|\kappa_A|$. (The reader
may wish to supply the details. It is quite easy to show that the circle through
three equally spaced points is nearly $C_A$.)
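
A quick numerical check of the three-point claim, offered as a sketch rather than as part of the argument: we take the parabola $y = x^2$, whose curvature at the origin is 2, pick three unequally spaced points at small real parameter values in place of infinitesimals, and compute the radius of the circle through them with the classical formula $r = abc / (4 \cdot \text{area})$. The function name is ours.

```python
import math

def circumradius(p, q, r):
    """Radius of the circle through three points of the plane."""
    a = math.dist(q, r)
    b = math.dist(p, r)
    c = math.dist(p, q)
    # Twice the triangle area, from the cross product of two edge vectors.
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return a * b * c / (2 * abs(cross))

def parabola(t):
    return (t, t * t)

for h in (1e-1, 1e-2, 1e-3):
    # F, B, E at parameters -h, 0, 2h: B lies between F and E, unequally spaced.
    print(h, circumradius(parabola(-h), parabola(0.0), parabola(2 * h)))
```

The printed radii approach 0.5, the reciprocal of the curvature at the origin, as `h` shrinks.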

So far we have shown: $(dT/ds)(A) = \kappa_A$, the curvature vector, is normal
to the unit tangent $T_A$. We let $N_A = \kappa_A / |\kappa_A|$ denote the principal normal.

$(1/|\kappa_A|) N_A$ points from $A$ to the center of the osculating circle. The
standard part of the circle through any three points infinitesimally near $A$
equals the osculating circle.

The curvature vector $\kappa$ can change in two ways as we move along the
curve. First, by increasing or decreasing in magnitude and second by
moving from the plane of the starting tangent and curvature vectors.
Taking four equally spaced infinitesimally nearby points $A = X(0)$,
$C = X(\delta)$, $D = X(2\delta)$, $E = X(3\delta)$, assuming $X(s) \in C^3$ and using Taylor's
formula, we find that the change in curvature from $C$ to $D$ is
$[(E - D) - (D - C)] - [(D - C) - (C - A)] = \delta^3 \cdot d^3X/ds^3 + \delta^3 \cdot o$ (see Fig. 9).


Fig. 9. 


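The reader should notice that it greatly simplifies the notation in the
calculation if one writes $\delta^3 \cdot o$ to mean "$\delta^3$ times some infinitesimal
vector", so

$$X(2\delta) = A + 2\delta\,T + \frac{(2\delta)^2}{2}\,\kappa + \frac{(2\delta)^3}{6}\,\frac{d^3X}{ds^3} + \delta^3 \cdot o,$$

$$X(3\delta) = A + 3\delta\,T + \frac{(3\delta)^2}{2}\,\kappa + \frac{(3\delta)^3}{6}\,\frac{d^3X}{ds^3} + \delta^3 \cdot o,$$

etc. This is the infinitesimal form of the Landau "small oh" calculus and
our motivation for the name of the infinitesimals.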
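
The third-difference formula can be tested on a curve whose derivatives are known in closed form. The sketch below assumes a unit-speed circular helix with radius `a` and pitch parameter `b` (our choice of example, not the text's) and a small real `delta` in place of the infinitesimal; all names are ours.

```python
import math

def helix(s, a=1.0, b=1.0):
    """Unit-speed helix X(s) = (a cos(s/c), a sin(s/c), b s / c), c = sqrt(a^2 + b^2)."""
    c = math.hypot(a, b)
    return (a * math.cos(s / c), a * math.sin(s / c), b * s / c)

def helix_third_derivative(s, a=1.0, b=1.0):
    """Closed form of d^3 X / ds^3 for the helix above."""
    c = math.hypot(a, b)
    return (a * math.sin(s / c) / c**3, -a * math.cos(s / c) / c**3, 0.0)

def third_difference(s, delta):
    """[(E - D) - (D - C)] - [(D - C) - (C - A)] for four equally spaced points."""
    A, C, D, E = (helix(s + j * delta) for j in range(4))
    return tuple((e - d) - (d - c) - ((d - c) - (c - a))
                 for a, c, d, e in zip(A, C, D, E))

s, delta = 0.7, 1e-2
scaled = tuple(v / delta**3 for v in third_difference(s, delta))
print(scaled, helix_third_derivative(s))
```

Dividing the third difference by `delta**3` reproduces the third derivative up to an error of order `delta`, which is the finite shadow of the term $\delta^3 \cdot o$.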
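The change in the unit normal from $C$ to $D$ on (4.2) is

$$\frac{[(E - D) - (D - C)]}{\delta^2 |\kappa|} - \frac{[(D - C) - (C - A)]}{\delta^2 |\kappa|} = \delta \cdot \frac{1}{|\kappa|}\,\frac{d\kappa}{ds} + \delta \cdot o.$$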
We complete our standard moving frame of orthogonal unit vectors $T$,
$N = \kappa / |\kappa|$ with the binormal $B = T \times N$. The change in the unit normal
above lies nearly in the plane of $T$ and $B$, it is nearly normal to $N$, so we
decompose it into $(T, B)$-coordinates: $0 = d/ds(\kappa \cdot T) = (d\kappa/ds) \cdot T + |\kappa|^2$,
so $(d\kappa/ds) \cdot T = -|\kappa|^2$. We denote the remaining part by $\tau = |\tau| B$, for
torsion.

$T(s + \delta) = T(s) + \delta|\kappa| N + \delta \cdot o$, by Taylor's formula.
$N(s + \delta) = N(s) - \delta|\kappa| T + \delta|\tau| B + \delta \cdot o$, by the calculations above for the
change in the normal and the $T$-component. $B(s + \delta) = B(s) - \delta|\tau| N + \delta \cdot o$,
simply by a "small oh" calculation with the above formulas in
$B(s + \delta) = T(s + \delta) \times N(s + \delta)$. Summarizing, mod $\approx^{\delta}$ we have the
Serret-Frenet formulas for the moving frame:

$$dT = |\kappa|\,N\,ds,$$
$$dN = -|\kappa|\,T\,ds + |\tau|\,B\,ds,$$
$$dB = -|\tau|\,N\,ds.$$

The sphere through the four points $A$, $C$, $D$, $E$ contains all this information
geometrically. Its standard part is called the osculating sphere. The plane of
$T$ and $N$ cuts this sphere in the previously computed osculating circle. The
center of the sphere lies above the center of the osculating circle along the
vector

$$\frac{1}{|\tau|}\,\frac{d}{ds}\Big(\frac{1}{|\kappa|}\Big)\,B,$$

where the two coefficients incorporate the twist in the plane of $T$, $N$ and
the straightening of the radius of curvature, respectively. We leave the
calculation to the reader. Note:

$$\frac{d^3X}{ds^3} = -|\kappa|^2\,T + \frac{d|\kappa|}{ds}\,N + |\kappa||\tau|\,B.$$

The formulas for the moving frame take a simple form if one computes
the axis of rotation in moving the frame at $C$ to the frame at $D$. One
obtains

$$dT = R \times T\,ds, \qquad dN = R \times N\,ds, \qquad dB = R \times B\,ds,$$

where $R = |\tau|\,T + |\kappa|\,B$. We also leave this calculation to the reader.
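
The rotation-axis form of the frame equations can be checked on a circular helix, for which the frame, curvature $a/(a^2+b^2)$ and torsion $b/(a^2+b^2)$ are known in closed form. This is a sketch under those assumptions, with a small real `delta` standing in for $ds$; the helper names are ours.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def frame(s, a=1.0, b=1.0):
    """Tangent, principal normal and binormal of the unit-speed helix."""
    c = math.hypot(a, b)
    T = (-a / c * math.sin(s / c), a / c * math.cos(s / c), b / c)
    N = (-math.cos(s / c), -math.sin(s / c), 0.0)
    return T, N, cross(T, N)

def darboux_defect(s, delta, a=1.0, b=1.0):
    """Largest componentwise gap between the finite-difference change of each
    frame vector V and the prediction R x V, where R = tau T + kappa B."""
    c2 = a * a + b * b
    kappa, tau = a / c2, b / c2
    T, N, B = frame(s, a, b)
    R = tuple(tau * t + kappa * bb for t, bb in zip(T, B))
    worst = 0.0
    for old, new in zip((T, N, B), frame(s + delta, a, b)):
        predicted = cross(R, old)
        for i in range(3):
            worst = max(worst, abs((new[i] - old[i]) / delta - predicted[i]))
    return worst

for delta in (1e-2, 1e-3, 1e-4):
    print(delta, darboux_defect(0.3, delta))
```

The defect shrinks in proportion to `delta`, consistent with the three equations holding modulo $\approx^{\delta}$. For the helix, $R$ works out to a constant vector along the axis.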


5. Gauss’ investigations of surfaces 


In this section we give a brief account of the role of infinite and
infinitesimal vectors in GAUSS [1827].
Surfaces embedded in $E^3$ have tangent bundles with simpler descriptions
than abstract manifolds. We gain enough in intuition and technical ease in
such a setting that an apology for lack of generality would be unnecessary
even if Gauss' results were not as profound as they are.

In his Abstract of the Disquisitiones generales circa superficies curvas, 
Gauss states the following. ‘‘In researches in which an infinity of directions 
of straight lines in space is concerned, it is advantageous to represent these 
directions by means of those points upon a fixed sphere, which are the end 
points of the radii drawn parallel to the lines. The center and the radius of 
this auxiliary sphere are here quite arbitrary. The radius may be taken 
equal to unity. This procedure agrees fundamentally with that which is 
constantly employed in astronomy, where all directions are referred to a 
fictitious celestial sphere of infinite radius’. Naturally, infinitesimal 
analysis could rigorously provide us with a celestial sphere of infinite radius 
which we could use to trace Gauss’ investigations. But as Gauss himself 
points out, we may as well normalize after we map the various things on 
that sphere anyway. 

In Article 4, Gauss lets $L$ be a point on the auxiliary sphere
corresponding to a unit normal to $T_A$ and also takes a point infinitesimally near $A$
($\delta$-infinitely far from $A$) with coordinates $x + \delta x$, $y + \delta y$, $z + \delta z$ where $A$
has coordinates $x, y, z$. If $L$ has coordinates $X, Y, Z$, then Gauss' tangency
condition (in Definition 3.1) directly implies

$$(X, Y, Z) \cdot (\delta x, \delta y, \delta z) \approx^{\delta} 0$$

or

$$X\,dx + Y\,dy + Z\,dz = 0,$$

after factorization mod $\approx^{\delta}$. The remainder of Article 4 is devoted to
deriving the form of this equation, first, when $M$ is given as a null-set of a
function $W$, $M = \{x : W(x) = 0\}$, second, when $M$ is given by two coordinate
maps, and third, when $z$ is given as a function of $x$ and $y$.

In the first case, we take $dW$ for the equivalence class mod $\approx^{\delta}$ of

$$W(x + \delta x, y + \delta y, z + \delta z) - W(x, y, z) = \delta W.$$

Naturally,

$$dW = P\,dx + Q\,dy + R\,dz$$

where

$$P = \frac{\partial W}{\partial x}, \qquad Q = \frac{\partial W}{\partial y}, \qquad R = \frac{\partial W}{\partial z}$$

by Theorem 2.10, provided $W$ is uniformly differentiable, so

$$P\cos(1)\lambda + Q\cos(2)\lambda + R\cos(3)\lambda \approx 0,$$

since $W$ does not change on $M$. This is true for each direction $\lambda$ arising
from a ($\delta$-finite) infinitesimal perturbation of $A$ on $M$ (by the onto
assumption), thus we obtain Gauss' formula for the normal coordinates

$$X = \frac{P}{\sqrt{P^2 + Q^2 + R^2}}, \qquad Y = \frac{Q}{\sqrt{P^2 + Q^2 + R^2}}, \qquad Z = \frac{R}{\sqrt{P^2 + Q^2 + R^2}},$$

or the negative of each.
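
To see the formula for the normal coordinates at work, the sketch below takes a specific null-set, the ellipsoid $W = x^2/4 + y^2 + z^2/9 - 1 = 0$ (our example, not Gauss'), normalizes $(P, Q, R)$ at a point $A$, and measures how far a nearby displacement on the surface is from being perpendicular to it. A small real parameter step stands in for the infinitesimal perturbation, and all names are ours.

```python
import math

def point(u, v):
    """A point of the ellipsoid x^2/4 + y^2 + z^2/9 = 1."""
    return (2 * math.cos(u) * math.cos(v), math.sin(u) * math.cos(v), 3 * math.sin(v))

def unit_normal(p):
    """(X, Y, Z) = (P, Q, R) / sqrt(P^2 + Q^2 + R^2) for W = x^2/4 + y^2 + z^2/9 - 1."""
    P, Q, R = p[0] / 2, 2 * p[1], 2 * p[2] / 9   # partial derivatives of W
    norm = math.sqrt(P * P + Q * Q + R * R)
    return (P / norm, Q / norm, R / norm)

def tangency_defect(u, v, du, dv):
    """|(X, Y, Z) . (B - A)| / |B - A| for a nearby point B on the surface."""
    A, B = point(u, v), point(u + du, v + dv)
    d = [b - a for a, b in zip(A, B)]
    n = unit_normal(A)
    return abs(sum(x * y for x, y in zip(n, d))) / math.sqrt(sum(x * x for x in d))

for step in (1e-2, 1e-3, 1e-4):
    print(step, tangency_defect(0.4, 0.3, step, step))
```

The defect shrinks with the step, which is the finite version of $(X, Y, Z) \cdot (\delta x, \delta y, \delta z) \approx^{\delta} 0$.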

We leave the account of the infinitesimals in the other forms of the 
formula to the reader. 

In Article 5, Gauss discusses the local orientation of the normal vector to
the surface. We add the following to that discussion. Nearby tangent planes
are nearly parallel, thus the internal statement, "a direction of the normal
line at $B$ forms an acute angle to the direction selected for the normal line
at $A$" holds on the infinitesimal neighborhood of $A$ and since that set is
external, out to some finite radius around $A$. On that set, the acute normal
at $B$ agrees with the orientation of $A$'s. In words, consistent local
orientation also follows directly from Gauss' condition of continuous
curvature.

Article 6 introduces the map which is now frequently called the Gauss
map, namely, over a consistently oriented portion of M we send $A \mapsto n(A)$,
the unit normal on the auxiliary sphere.

The integral curvature of a portion of M is defined to be the area of its
image on the auxiliary sphere. Now we take a standard point A on M and a
$\delta$-area element in the sense of the last section. The ratio of the internal
integral curvature of the volume element divided by the area of the volume
element (which we can compute as a parallelogram on $T_A$ with only
$\delta$-error) is a finite number whose standard part Gauss calls the measure of
curvature of M at A. This exists and is well defined independent of $\delta$ and
the particular $\delta$-area element by standard results of infinitesimal calculus.

The following is an account of Gauss' Article 7 with some changes in
notation (and perhaps $\delta$-infinitesimal differences). Three ($\delta$-finite)
infinitesimally close points on *M, A, B, C span a parallelogram on $T_A$ with
sides AB and AC whose area is $\delta^2$-nearly equal to that of the projected
image on *M. This is a $\delta$-area element on *M in the cases where $B - A$ and
$C - A$ have a finite angle between them and are not $\delta$-infinitesimal. We
assume this to be the case and denote it by $d^2\sigma$ (mod $\delta^2$).

The area element $d^2\sigma$ is mapped to a figure on the auxiliary sphere
under the Gauss map n, moreover, $n(A + E) \approx^{\delta} n(A) + Dn_A(E)$,
whenever $A + E$ is in $\delta M_A$, in fact, $n(A + E) = n(A) + Dn_A(E) + |E| \cdot \eta$
and the error $\eta$ is bounded by the infinitesimal $\max[\,|\eta| : A + E \in d^2\sigma\,]$.


Thus we replace $n(d^2\sigma)$ by $d^2\Sigma$, the projection on the auxiliary sphere of
the parallelogram spanned by n(A), n(B), n(C) on the tangent plane to
the sphere at n(A). Again, since the sphere has uniformly continuous
curvature, the area (mod $\delta^2$) of $d^2\Sigma$ can be calculated either on the sphere
or the tangent plane at n(A).

Now, the measure of curvature

$$k \approx \frac{\operatorname{area}(d^2\Sigma)}{\operatorname{area}(d^2\sigma)}$$


and we could use the familiar cross product formula for the respective
areas on the respective tangent planes. Article 7 is a derivation of the
formula for k in the case where M is given in Gauss' third form, z = f(x, y).
In this case $T_A$ is not perpendicular to the xy-plane and thus

$$\frac{\operatorname{area}(d^2\Sigma)}{\operatorname{area}(d^2\sigma)} = \frac{\operatorname{area}(\text{projection of } d^2\Sigma \text{ on } xy\text{-plane})}{\operatorname{area}(\text{projection of } d^2\sigma \text{ on } xy\text{-plane})}. \tag{5.1}$$

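We denote the coordinates of the xy-projections:

    A: $(x,\ y)$                                n(A): $(X,\ Y)$
    B: $(x + \delta_1 x,\ y + \delta_1 y)$      n(B): $(X + \delta_1 X,\ Y + \delta_1 Y)$
    C: $(x + \delta_2 x,\ y + \delta_2 y)$      n(C): $(X + \delta_2 X,\ Y + \delta_2 Y)$

The areas of the parallelograms are then
$$(\delta_1 x)(\delta_2 y) - (\delta_2 x)(\delta_1 y), \qquad (\delta_1 X)(\delta_2 Y) - (\delta_2 X)(\delta_1 Y), \tag{5.2}$$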


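respectively.
X and Y are functions of x and y since M is given by z = f(x, y) and

$$\delta_1 X = X(B) - X(A) \approx^{\delta} \frac{\partial X}{\partial x}\,\delta_1 x + \frac{\partial X}{\partial y}\,\delta_1 y,$$

$$\delta_2 X = X(C) - X(A) \approx^{\delta} \frac{\partial X}{\partial x}\,\delta_2 x + \frac{\partial X}{\partial y}\,\delta_2 y,$$

$$\delta_1 Y = Y(B) - Y(A) \approx^{\delta} \frac{\partial Y}{\partial x}\,\delta_1 x + \frac{\partial Y}{\partial y}\,\delta_1 y,$$

$$\delta_2 Y = Y(C) - Y(A) \approx^{\delta} \frac{\partial Y}{\partial x}\,\delta_2 x + \frac{\partial Y}{\partial y}\,\delta_2 y.$$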


Now substitute this into (5.2) and then into (5.1) to obtain

$$k = \frac{\partial X}{\partial x}\frac{\partial Y}{\partial y} - \frac{\partial X}{\partial y}\frac{\partial Y}{\partial x}$$

with "equality" rather than only "infinitesimally nearly" since both sides
are standard.

The infinitesimal account of the remainder of Article 7 is left to the 
reader. 

In Article 8, Gauss derives the theorem of Euler which says that amongst 
the curves formed by the intersection of M and planes containing the 
normal thru A, those of maximum and minimum curvature are at right 
angles. He also shows that the product of these maximum and minimum 
curvatures is the measure of curvature ‘‘k = TV”. We paraphrase as 
follows. 

Fix a point A on M and select a coordinate system with zero at A, the 
z-axis along the normal to $T_A$ and $x^0$, $y^0$ perpendicular coordinates in $T_A$.
Gauss' tangency condition means that z vanishes to the first order near A.
If M is $C^2$, and $(x^0, y^0, z)$ is a $\delta$-finite vector

$$z = \tfrac{1}{2} T^0 (x^0)^2 + U^0 x^0 y^0 + \tfrac{1}{2} V^0 (y^0)^2 + \delta^2 \cdot o.$$
Turning the axes of x and y thru an angle $\theta$ with

$$\tan 2\theta = \frac{2U^0}{T^0 - V^0}$$

it is easily seen that the new formula for z is
$$z = \tfrac{1}{2} T x^2 + \tfrac{1}{2} V y^2 + \delta^2 \cdot o.$$


(I) If the curved surface is cut by the plane normal to $T_A$ containing the
x- and z-axes, the curvature of the resulting curve is T. The sign of T says
whether the curve bends above or below $T_A$.

(II) In like manner V represents the curvature cut by the normal plane 
thru the y- and z-axes with the same sign convention. 

(III) Setting $x = r\cos\varphi$ and $y = r\sin\varphi$, the equation

$$z = \tfrac{1}{2}\,(T\cos^2\varphi + V\sin^2\varphi)\,r^2 + r^2 \cdot o,$$
for infinitesimal r, gives the curvature
$$T\cos^2\varphi + V\sin^2\varphi$$

to the curve cut on M by the normal plane thru the z-axis making an angle
$\varphi$ with the x-axis.

(IV) Therefore, whenever T = V all curves cut by normal planes thru 
the z-axis have the same curvature. When T and V are not equal but have 
the same sign, one is the maximum and other the minimum of all such 
normal curvatures. On the other hand, one has the greatest convex
curvature, the other the greatest concave curvature, if T and V have
opposite signs.
(V) The measure of curvature at A takes the very simple form 


k = TV. 


We see this by computing with $(x, y) = (\delta, 0)$ and $(x, y) = (0, \delta)$ so that the
length of $(\delta, 0)$ on the normalized osculating circle is $T\delta$, $(0, \delta)$ normalizes
to $V\delta$, so the area of $(d^2\Sigma)$ is $TV\delta^2$ mod $\delta^2$, while the area of $(d^2\sigma)$ is
$\delta^2$ mod $\delta^2$. (In terms of $Dn_A$,


in (x, y)-coordinates.) This shows Gauss’ result for Article 8: 


5.1. THEOREM. The measure of curvature at a point A of a $C^2$-surface is the
product of the extreme curvatures of curves cut by normal planes thru n(A).
These extreme curvatures occur at right angles.
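
The computation behind (I)-(V) can be checked numerically. The following is only a sketch in ordinary floating-point arithmetic, with no infinitesimals: it assumes the quadratic part $z = \tfrac12 T^0 (x^0)^2 + U^0 x^0 y^0 + \tfrac12 V^0 (y^0)^2$, rotates the axes by the angle $\theta$ with $\tan 2\theta = 2U^0/(T^0 - V^0)$, and confirms that the cross term vanishes, that $TV = T^0 V^0 - (U^0)^2$, and that T and V are the extreme values of $T\cos^2\varphi + V\sin^2\varphi$. The function names and the sample coefficients 3, 1, 1 are illustrative choices, not taken from the text.

```python
import math

def rotate_to_principal_axes(T0, U0, V0):
    """Rotate z = T0/2*x0^2 + U0*x0*y0 + V0/2*y0^2 by the angle theta with
    tan(2*theta) = 2*U0/(T0 - V0); return (theta, T, U, V) for the rotated
    form z = T/2*x^2 + U*x*y + V/2*y^2."""
    theta = 0.5 * math.atan2(2.0 * U0, T0 - V0)
    c, s = math.cos(theta), math.sin(theta)
    # Substitute x0 = c*x - s*y, y0 = s*x + c*y and collect coefficients.
    T = T0 * c * c + 2.0 * U0 * c * s + V0 * s * s
    V = T0 * s * s - 2.0 * U0 * c * s + V0 * c * c
    U = (V0 - T0) * c * s + U0 * (c * c - s * s)
    return theta, T, U, V

def normal_curvature(T, V, phi):
    """Curvature of the normal section making angle phi with the x-axis."""
    return T * math.cos(phi) ** 2 + V * math.sin(phi) ** 2

# Hypothetical sample coefficients.
theta, T, U, V = rotate_to_principal_axes(3.0, 1.0, 1.0)
k = T * V  # measure of curvature, k = TV
samples = [normal_curvature(T, V, 2.0 * math.pi * i / 360) for i in range(360)]
```

The product TV is the determinant of the quadratic form, which a rotation does not change; this is why k can be read off in either coordinate system.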


We believe that this is enough infinitesimal analysis to prepare the 
reader for Gauss’ subsequent uses of infinitesimals — it does leave some 
fascinating reading to test infinitesimals on. 
1. Introduction


This chapter has an aim different from that of most of the other chapters 
of this volume. Whereas those give a comprehensive overview of a 
well-defined branch of mathematical logic, the present one aims at 
exposing a remarkable point of contact of several topics in logic each of 
which has an extensive literature. Those who are more theoretically 
minded, interested in the explanatory aspects of science, in contrast to the 
experimentally minded who hunt for new phenomena, would always take 
time to watch how apparently unrelated or vaguely related topics join in a 
coherent unity, even though the new results forthcoming from such a state 
of affairs may not (yet) be spectacular. This chapter is written for such a 
“theoretically minded’’ reader. 

There are four participants of this affair, viz. (i) admissible sets, (ii) the
model theory of $L_{\omega_1\omega}$, (iii) classical descriptive set theory (cf. Chapter C.8),
(iv) effective descriptive set theory (hyperarithmetic, $\Pi^1_1$, etc. sets) (cf. also
Chapter C.8). These four aspects of our theme will appear quite explicitly
in the chapter. There are two further, in this chapter more or less hidden, 
subjects playing important supporting or motivating roles, viz. set theory 
and recursion theory. The theory of inductive definability, a topic that 
recently gained an independent status (cf. Chapter C.7), has important 
connections to our theme but we will not be able to discuss those. 

Of the four listed ingredients, the first two will be the "active" ones.
Our main theme will be the model theory of admissible fragments of $L_{\omega_1\omega}$,
but we will develop it only as far as our results have direct applications to
the latter two subjects.

Admissible sets were introduced by KRIPKE [1964] and PLATEK [1966].
Their point of view was primarily that of recursion theory. They generalized
ordinary recursion theory of the integers to ordinals smaller than a
fixed well-behaved, so-called admissible ordinal. Originally, admissible
ordinals were defined in terms of a recursion theory in the style of the
Herbrand-Gödel equation calculus. It was a decisive step to switch from
admissible ordinals to admissible sets; technically, from the admissible
ordinal $\alpha$ to the set $L_\alpha$ of sets constructible before $\alpha$. As a result, the
admissibility condition on $\alpha$ gets translated into an elegant first-order
axiom system KP, talking about the set-theoretic structure $(L_\alpha, \in \restriction L_\alpha)$.

KP is a very good axiom system. It is simple and, moreover, it has a
"self-contained" theory. E.g., one does not need to add the axiom "V = L"
(which, however, is true in the originally conceived intended models) in
order to have most of the desired consequences. It would indeed be a pity if
"V = L" were needed since it is by no means a simple or a "natural"
axiom.

The connections of admissible sets to effective descriptive set theory
(Participant (iv)) were recognized early. The fundamental results that (a)
the smallest admissible set beyond the trivial one is $\mathrm{HYP}(\omega) = L_{\omega_1^{CK}}$, the set
of sets constructible before the Church-Kleene (smallest non-recursive)
ordinal $\omega_1^{CK}$, and (b) $a \subseteq \omega$ is hyperarithmetic iff $a \in \mathrm{HYP}(\omega)$, were found
by Kripke and Platek (cf. 6.5 and 6.6 below).

For a modern treatment of admissible sets, we refer to BARWISE [1975]. 

The second ingredient is infinitary logic. Infinitary logic essentially
started out with the Hanf-Tarski work on incompactness (HANF [1964])
and it was first systematically investigated by KARP [1964]. From around
1962, the model theory of $L_{\omega_1\omega}$, the tamest infinitary logic, was taken up.
For an early account, cf. SCOTT [1965]. A treatment of this theory, in an
already mature state, is KEISLER [1971].

The connections between descriptive set theory and infinitary logic were 
clearly in sight from the beginning. Scott's isomorphism theorem (SCOTT
[1964]) was equivalent to an answer to an old question of Kuratowski.
Lopez-Escobar’s generalization of Craig’s interpolation theorem (cf. also 
below) was a generalization of Suslin’s separation theorem. In general, the 
connection has the following two aspects. First, infinitary logic replaces 
classes of points of a space by classes of models. Second, in logic syntactic 
aspects, only implicit in descriptive set theory, become explicit and give rise 
to interesting considerations. E.g., Malitz's preservation theorem (MALITZ
[1971]) characterizes sentences of $L_{\omega_1\omega}$ preserved to substructures as those
that are logically equivalent to universal ones. Infinite formulas are lurking 
behind Borel and analytic sets, but one can think of Malitz’s theorem only 
if formulas have been made explicit. As we see it now, Malitz’s theorem 
and the Suslin separation theorem have an intimate relationship in a 
common generalization, cf. MAKKAI [1973a]. 

Vaught has made very important contributions, actually in two opposing
directions, to establishing and deepening the connections of descriptive set
theory and logic, cf. VAUGHT [1973] and VAUGHT [1974]. One aspect of
VAUGHT [1973] will be exposed below in detail. In VAUGHT [1974] a sort of
opposite point of view is taken and model theory is being "eliminated" in
favor of topological methods, with interesting and surprising success.

The contact (that is the main point of this chapter) between admissible
sets and $L_{\omega_1\omega}$ (and $L_{\infty\omega}$) was made by BARWISE [1969]. Barwise introduced
admissible fragments of $L_{\omega_1\omega}$ (and of $L_{\infty\omega}$), by considering only those
formulas that belong to a given admissible set. This step was made in the
spirit of suggestions of Kreisel who had emphasized that pure cardinality
considerations (the earlier basis of classification of infinite formulas into
the languages $L_{\alpha\beta}$) were too crude to yield any interesting results. The
rewards were instant: Barwise's completeness and $\Sigma$-compactness
theorems (cf. also below) showed that admissible fragments had properties
similar to ones of ordinary first-order logic that had long been recognized
as basic.

Our point of view of the relationship of admissible sets to logic can be
summarized as the phenomenon that can be called the syntactic completeness
of admissible sets. Let A be a countable admissible set, $L_A$ the
fragment $L_{\omega_1\omega} \cap A$. "Syntactic completeness" is exemplified by the following
examples. If $\varphi \in L_A$ is logically valid, then not only is there a derivation
d of $\varphi$ in a Gentzen-type, infinitary formal system (by the Completeness
Theorem for $L_{\omega_1\omega}$, KARP [1964], LOPEZ-ESCOBAR [1965]) but in fact, d can
be chosen in A (Barwise's Completeness Theorem). As a second example,
we note that Beth's Definability Theorem is true for single sentences $\varphi(P)$
of $L_{\omega_1\omega}$ (LOPEZ-ESCOBAR [1965], cf. also below). Again, the formula
"explicitly defining P" can be chosen in $L_A$ once $\varphi(P)$ belongs to $L(P)_A$.
Actually, admissible sets are essentially the optimal solutions to the
problem of finding "syntactically complete" fragments of $L_{\omega_1\omega}$.

As another remark, let us note that countable admissible sets will not in
general be "semantically complete". E.g., a sentence in $L_A$ might have no
model in A, even though it has one. Let us also note that it is easy to
construct "syntactically and semantically complete" countable transitive
sets (by downward Löwenheim-Skolem arguments, starting from HC, the
set of all hereditarily countable sets) but one gets only a few
admissible sets, and not the most interesting ones either, in this way. The
basic facts of logic on admissible sets lie therefore somewhat deeper than
certain obvious reflection arguments.

The gain of adding admissible sets to the model theoretical-descriptive
set-theoretical point of view consists in bridging the gap between
non-effective and effective descriptive set theory, a goal that was emphasized
already before admissible sets by ADDISON [1962]. To refer to perhaps the
main instance of the connection, we will elaborate how (a version of)
Kleene's theorem stating that hyp = $\Delta^1_1$ on the one hand and the Suslin
separation theorem on the other become special cases of a single result, cf.
Section 8 below.

The chapter is aimed at the model theorist, i.e. someone who has some 
knowledge and appreciation of the spirit of the model theory of ordinary 
first-order logic. We have chosen to present those methods of the subject 
that are most model theoretical in spirit. In particular, we will feature two
recent devices: Vaught's use of conjunctive game sentences, and Ressayre's
$\Sigma$-saturated structures. Let us emphasize that, as a result, we have not taken
the most direct routes to some of the basic facts (such as the Barwise
Completeness Theorem).

The first two sections (after the present one) are largely descriptive and 
do not contain much in the way of proofs. The main theme is taken up in 
Section 4; the main line of the model theory is essentially self-contained. 
Applications to (effective) descriptive set theory are in all of Sections 5 
to 9. 


2. The Kripke-Platek axiom system


Urelements 


Although urelements have not been fashionable for some time, they are
coming back into vogue just now, at least in the context of admissible sets;
cf. BARWISE [1975]. Axiomatic set theory can be naturally developed with
due regard to urelements, without much change of the usual exposition of,
e.g., Zermelo-Fraenkel set theory.

Urelements are "points" with no set-theoretical structure, i.e., if u is an
urelement, no object whatever is an element of u: $x \notin u$ always. One can
describe the Cantorian universe of sets with urelements as follows. Let U
be any set the elements of which are called urelements. For ordinals $\alpha$, we
define $V_{U,\alpha}$ by $V_{U,0} = U$, $V_{U,\alpha+1} = \mathcal{P}(V_{U,\alpha})$, $V_{U,\lambda} = \bigcup_{\alpha<\lambda} V_{U,\alpha}$ for limit $\lambda$,
and we put $V_U = \bigcup_{\alpha \in \mathrm{Or}} V_{U,\alpha}$. $V_U$ is the universe of sets with support $\subseteq U$.
The usual unramified hierarchy is obtained with $U = \emptyset$. The membership
relation $\in_U$ on $V_U$ is naturally defined in such a way that for any $u \in U$, any
$x \in V_U$, $x \notin_U u$. Henceforth we write $\in$ for $\in_U$.

In the language of set theory with urelements, we have a predicate U in
addition to $\in$ and equality, with the obvious interpretations in $V_U$ when
regarded as the structure of all sets with support $\subseteq U$. We will use the bold
face notation $\boldsymbol{\in}$ and $\mathbf{U}$ only for emphasis; usually we will write $\in$ and U
even in formulas.

The standard axioms of Zermelo-Fraenkel set theory with the axiom of
choice (ZFC) will undergo only obvious changes when formulated as
assumptions on $V_U$. E.g., the axiom of extensionality becomes

$$(\neg U a \wedge \neg U b) \rightarrow (\forall x\,(x \in a \leftrightarrow x \in b) \rightarrow a = b).$$


238 MAKKAI/INFINITARY LOGIC [cu. A.7, §2 


We will use set-theoretic terminology in connection with $V_U$ ("powerset",
"ordinal", etc.) in the expected sense.

We refer to BARWISE [1975] for a discussion of why it is reasonable to
consider a formulation of set theory, especially admissible set theory, using
urelements. We can indicate the reason for our interest in urelements as
follows. We will consider a special admissible set $\mathrm{HYP}_{\mathfrak{M}}$, the admissible
hull of a structure $\mathfrak{M} = (|\mathfrak{M}|, R_1, \ldots, R_n)$. We will have $\mathfrak{M} \in \mathrm{HYP}_{\mathfrak{M}}$, in
particular, $|\mathfrak{M}| \in \mathrm{HYP}_{\mathfrak{M}}$. We want $\mathrm{HYP}_{\mathfrak{M}}$ to have model-theoretical
significance and for that, we will naturally need that any isomorphism $f : \mathfrak{M} \cong \mathfrak{N}$
can be extended to an $\in$-isomorphism of $\mathrm{HYP}_{\mathfrak{M}}$ and $\mathrm{HYP}_{\mathfrak{N}}$. Clearly, for
this we need that elements of $|\mathfrak{M}|$ do not have "set-theoretical individuality"
in $\mathrm{HYP}_{\mathfrak{M}}$, i.e., that they be urelements in $\mathrm{HYP}_{\mathfrak{M}}$.

Apart from $\mathrm{HYP}_{\mathfrak{M}}$, there is not much reference to urelements in this
chapter.

The introduction of urelements into admissible set theory, $\mathrm{HYP}_{\mathfrak{M}}$ and
related notions are all due to Barwise, cf. BARWISE [1975]. With minor
modifications, this section is based on the same source.


The Kripke-Platek axiom system 


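2.1. DEFINITION. The collection of $\Delta_0$-formulas (of the language $\{\in, U\}$) is
the smallest collection Y containing the atomic formulas and closed under
the conditions:

(i) If $\varphi$ is in Y, so is $\neg\varphi$.

(ii) If $\varphi$ and $\psi$ are in Y, so are $\varphi \wedge \psi$ and $\varphi \vee \psi$.

(iii) If $\varphi$ is in Y, so are $\forall x \in y\,(\varphi)$ and $\exists x \in y\,(\varphi)$.

Note that $\forall x \in y\,(\varphi)$ means the same as $\forall x\,[x \in y \rightarrow \varphi]$ and $\exists x \in y\,(\varphi)$
as $\exists x\,[x \in y \wedge \varphi]$.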

Given a transitive set A (i.e. $x \in A$ and $y \in x$ imply that $y \in A$), we
can consider A as a structure for the language $\{\in, U\}$ by interpreting $\in$ by
the real $\in$ restricted to A and U by $U \cap A$. Mostly, we will write simply A
to denote the structure $(A, \in \restriction A, U \cap A)$.

Now, the main property of $\Delta_0$-formulas is that they are absolute for
transitive interpretations in the sense of:


2.2. PROPOSITION. For transitive sets $A, B \subseteq V_U$, if $A \subseteq B$, then for any
$\Delta_0$-formula $\varphi(x)$ and any elements a in A, we have $A \models \varphi[a]$ iff $B \models \varphi[a]$ iff
$V_U \models \varphi[a]$.


PROOF. This is an easy induction on the complexity of $\varphi$. □

Many elementary concepts of set theory can be formalized with $\Delta_0$-formulas
(e.g., "f is a function", "a is an ordinal" etc.). The significance of
this fact is that it doesn't matter if we evaluate the defining formulas in $V_U$
or in any transitive set containing the object in question.
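
The absoluteness of $\Delta_0$-formulas can be observed on small finite examples. The sketch below is illustrative only and is not part of the chapter's development: it assumes no urelements, models hereditarily finite sets as Python `frozenset`s, and uses the finite levels `V(n)` as transitive sets. The tuple encoding of formulas and the names `holds`, `TRANSITIVE` and `IS_MEMBER` are hypothetical. A bounded quantifier ranges over the elements of its bound, so the surrounding universe is never consulted; an unbounded existential quantifier ranges over the universe, and its value can change when the universe grows.

```python
from itertools import combinations

def powerset(s):
    """All subsets of s, as frozensets."""
    items = list(s)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}

def V(n):
    """Finite level V_n of the hierarchy without urelements (a transitive set)."""
    level = set()
    for _ in range(n):
        level = powerset(level)
    return level

def holds(phi, env, universe):
    """Evaluate a formula, given as a nested tuple, in the structure `universe`."""
    op = phi[0]
    if op == "in":
        return env[phi[1]] in env[phi[2]]
    if op == "not":
        return not holds(phi[1], env, universe)
    if op == "and":
        return holds(phi[1], env, universe) and holds(phi[2], env, universe)
    if op == "or":
        return holds(phi[1], env, universe) or holds(phi[2], env, universe)
    if op == "all_in":   # bounded: for all x in y
        _, x, y, body = phi
        return all(holds(body, {**env, x: e}, universe) for e in env[y])
    if op == "ex_in":    # bounded: there is x in y
        _, x, y, body = phi
        return any(holds(body, {**env, x: e}, universe) for e in env[y])
    if op == "ex":       # unbounded: there is x in the universe
        _, x, body = phi
        return any(holds(body, {**env, x: e}, universe) for e in universe)
    raise ValueError(f"unknown connective: {op!r}")

# Delta_0: "v is transitive", i.e. for all y in v, for all x in y, x in v.
TRANSITIVE = ("all_in", "y", "v", ("all_in", "x", "y", ("in", "x", "v")))
# Not Delta_0: "there is w with v in w" (one unbounded quantifier).
IS_MEMBER = ("ex", "w", ("in", "v", "w"))

empty = frozenset()
one = frozenset({empty})        # {0}, a transitive set
single = frozenset({one})       # {{0}}, not transitive

same_in_V2_and_V4 = (holds(TRANSITIVE, {"v": one}, V(2))
                     == holds(TRANSITIVE, {"v": one}, V(4)))
```

Here `TRANSITIVE` takes the same value for `one` in `V(2)` and in `V(4)`, while `IS_MEMBER` is false for `one` in `V(2)` and true in `V(3)`, where the witness `{{0}}` first appears.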


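2.3. DEFINITION. KPU, the Kripke-Platek axiom system with urelements,
is the theory over the language $\{\in, U\}$, axiomatized by the universal
closures of the following axioms:

Urelements: $U(u) \rightarrow x \notin u$.

Empty set: $\exists x\,(\neg U(x) \wedge \forall y\,(y \notin x))$.

Extensionality: see above.

Foundation (schema): $\forall x\,(\forall y \in x\, \varphi(y) \rightarrow \varphi(x)) \rightarrow \forall x\, \varphi(x)$ for all
formulas $\varphi(x)$.

Pair: $\exists a\,(x \in a \wedge y \in a)$.

Union: $\exists b\, \forall y \in a\, \forall x \in y\,(x \in b)$.

$\Delta_0$-Separation (schema): $\exists b\, \forall x\,(x \in b \leftrightarrow x \in a \wedge \varphi(x))$ for all
$\Delta_0$-formulas $\varphi$ (in which b does not occur free).

$\Delta_0$-Collection (schema): $\forall x \in a\, \exists y\, \varphi(x, y) \rightarrow \exists b\, \forall x \in a\, \exists y \in b\, \varphi(x, y)$
for all $\Delta_0$-formulas $\varphi$ (in which b does not occur free).

($\varphi$, $\psi$ above may have free variables other than the ones indicated.)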


2.4. DEFINITION. An admissible set is a transitive set A (in some $V_U$) that is
a model of KPU.


Of course, extensionality and foundation are automatic for transitive 
sets. These axioms appear in KPU because one considers ‘“‘non-standard”’ 
models of KPU as well, cf. below. 

We are going to state some important general derived principles next. 


2.5. DEFINITION. A formula of the form $\exists u\, \varphi(u)$ where $\varphi$ is $\Delta_0$, is called a
$\Sigma_1$-formula. The class of $\Sigma$-formulas is the smallest class Y containing the
$\Delta_0$-formulas and closed under

(a) conjunction and disjunction (condition (ii) of 2.1),

(b) bounded quantification (condition (iii) of 2.1),

(c) existential quantification: if $\varphi$ is in Y, so is $\exists u\, \varphi$.

The notion of $\Pi$-formula is obtained by replacing $\exists$ by $\forall$ in the last
clause. Up to logical equivalence, $\Pi$-formulas are exactly the negations of
$\Sigma$-formulas.

The following is the basic (upward) persistence property of $\Sigma$-formulas.


2.6. PROPOSITION. Let $A \subseteq B$ be transitive sets $\subseteq V_U$, $\varphi(x)$ a $\Sigma$-formula, $a \in A$.
Then $A \models \varphi[a]$ implies $B \models \varphi[a]$ and $V_U \models \varphi[a]$.


PROOF. This is an easy induction on $\varphi$. □


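Given a formula $\varphi$ and a variable w, we write $\varphi^{(w)}$ for the result of
replacing each unbounded quantifier in $\varphi$ by a quantifier bounded by w:
$\exists u$ by $\exists u \in w$, $\forall u$ by $\forall u \in w$. w should not occur in $\varphi$. If $\varphi$ is $\Delta_0$, then
$\varphi^{(w)} = \varphi$. It is practically obvious that for $\Sigma$-formulas $(\varphi^{(u)} \wedge u \subseteq v) \rightarrow \varphi^{(v)}$
and $\varphi^{(u)} \rightarrow \varphi$ are logically valid.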


2.7. PROPOSITION ($\Sigma$-reflection principle). For all $\Sigma$-formulas $\varphi$ the following
is a theorem of KPU:


$$\varphi \leftrightarrow \exists a\, \varphi^{(a)}.$$


In particular, every $\Sigma$-formula is equivalent to a $\Sigma_1$-formula in KPU.


PROOF. The proof is by induction on $\varphi$. Notice that $\Delta_0$-collection is a special
case of $\Sigma$-reflection. Conversely, $\Delta_0$-collection is used in the proof in the
induction step: $\varphi$ is $\forall u \in v\, \psi(u)$. □


An easy consequence of 2.7 is $\Sigma$-collection, that is, like $\Delta_0$-collection with
$\varphi$ an arbitrary $\Sigma$-formula.
The next definition is fundamental. 


2.8. DEFINITION. Let A be an admissible set.

(i) A predicate on A is $\Sigma$ (light face sigma) (or, $\Sigma_A$) if it is defined by a
$\Sigma$-formula (in the structure $A = (A, \in \restriction A, U \cap A)$, without parameters).

(ii) A predicate P(x) on A is $\boldsymbol{\Sigma}$ (bold face sigma) if it is defined by a
$\Sigma$-formula with parameters in A, i.e., there is a $\Sigma$-formula $\varphi(x, y)$ and fixed
elements b in A such that for any a in A, $P(a) \Leftrightarrow A \models \varphi[a, b]$.

(iii) An operation on A is $\Sigma$, or $\boldsymbol{\Sigma}$, according to its graph.

(iv) A predicate $P \subseteq A^n$ is $\Delta$ (or $\boldsymbol{\Delta}$), if both P and its complement
$A^n - P$ are $\Sigma$ (or $\boldsymbol{\Sigma}$). ($A^n - P$ being $\Sigma$ or $\boldsymbol{\Sigma}$ is the same as P being $\Pi$, or $\boldsymbol{\Pi}$.)


For orientation, at this point the reader should consider the admissible
set HF of hereditarily finite sets without urelements and should see (or
at least believe) that the $\Sigma$ (also, the $\boldsymbol{\Sigma}$) predicates on HF (restricted to $\omega$)
are exactly the recursively enumerable ones, and hence $\Delta = \boldsymbol{\Delta}$ = recursive.
This example is fundamental in so far as much of what we look for on admissible sets
is motivated by a desire to find analogs of facts of ordinary recursion
theory, generalized to $\Sigma$ from r.e., and to $\Delta$ from recursive.

Emphasizing this analogy, sometimes we say that a is A-finite if $a \in A$,
a predicate P is A-recursively enumerable (A-r.e.) if it is $\Sigma_A$, A-recursive
(A-rec.) if it is $\Delta_A$, and an operation F is A-recursive if it is $\Sigma_A$.


2.9. PROPOSITION ($\Delta$-separation principle). For any $\Sigma$-formula $\varphi(x)$ and
$\Pi$-formula $\psi(x)$, perhaps containing other free variables besides x, the
following is a theorem of KPU: If for all $x \in a$, $\varphi(x) \leftrightarrow \psi(x)$, then there is b
such that $b = \{x \in a : \varphi(x)\}$.


PROOF. Assume $\forall x \in a\,(\varphi(x) \leftrightarrow \psi(x))$. Then $\forall x \in a\,(\varphi(x) \vee \neg\psi(x))$,
which is equivalent to a $\Sigma$-formula, so by $\Sigma$-reflection, there is a c such that

(i) $\forall x \in a\,[\varphi^{(c)}(x) \vee \neg\psi^{(c)}(x)]$. Let, by $\Delta_0$-separation, $b = \{x \in a : \varphi^{(c)}(x)\}$.
Clearly, every $x \in b$ satisfies $\varphi(x)$. If $x \in a$ and $\varphi(x)$, then $\psi(x)$,
so $\psi^{(c)}(x)$ (since $\psi$ is $\Pi$). So by (i), $\varphi^{(c)}(x)$. Thus $x \in b$ which shows that b is
as desired. □


Notice that the careful formulation of 2.9 has a direct consequence: 


2.10. COROLLARY ($\Delta_A$-separation). For A admissible, for $a \in A$ and for a $\Delta_A$
predicate P(x) on A, $\{x \in a : P(x)\}$ is an element of A.


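2.11. COROLLARY ($\Sigma$-replacement). For each $\Sigma$-formula $\varphi(x, y)$ the following
is a theorem of KPU: If $\forall x \in a\, \exists! y\, \varphi(x, y)$, then there is a function f,
dom(f) = a, such that $\forall x \in a\, \varphi(x, f(x))$.


PROOF. By $\Sigma$-collection, there is b such that $\forall x \in a\, \exists y \in b\, \varphi(x, y)$. Using
$\Delta$-separation, there is an f such that


$$f = \{(x, y) \in a \times b : \varphi(x, y)\} = \{(x, y) \in a \times b : \neg\exists z\,[\varphi(x, z) \wedge y \neq z]\}. \quad \Box$$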


2.12. COROLLARY ($\Sigma_A$-replacement). For A admissible, for $a \in A$ and for a
$\Sigma_A$-function F on A, the restriction $F \restriction a$ is an element of A.


We have "formal" analogs of the notions in 2.8. A $\Sigma$-operation symbol of
KPU is an operation symbol F introduced by a new axiom
$\forall x\, \forall y\,(F(x) = y \leftrightarrow \varphi(x, y))$ with a $\Sigma$-formula $\varphi$ such that $\mathrm{KPU} \vdash \forall x\, \exists! y\, \varphi(x, y)$. A
$\Delta$-predicate symbol of KPU is a predicate symbol P introduced by a new
axiom $\forall x\,(Px \leftrightarrow \varphi(x))$ such that $\varphi$ is $\Sigma$ and there is a $\Pi$-formula $\psi(x)$ with
$\mathrm{KPU} \vdash \forall x\,(\varphi(x) \leftrightarrow \psi(x))$. Of course, the "inessential" extension of KPU by
such symbols is a conservative extension of KPU, but more is true. If we
allow them to enter atomic formulas and we redefine $\Delta_0$- and $\Sigma$-formulas
accordingly, then we obtain formulas that are provably equivalent to $\Delta$-
(i.e., both to a $\Sigma$- and a $\Pi$-formula), respectively, to $\Sigma$-formulas in the old
sense. Moreover, all the new axioms and derived rules of KPU obtained
with the new notions of $\Delta_0$, $\Delta$ and $\Sigma$ become provable. E.g., for
$\Delta_0$-separation for the new setting, this fact involves the $\Delta$-separation principle
(Proposition 2.9) for the old KPU.

These facts are related to the notion of relative admissibility. If S is a
finite list of predicates and operations on a transitive set A, then A is said
to be admissible relative to S if all axioms of KPU, with elements of S
appearing in atomic formulas, are true in $(A, \in \restriction A, S)$. We have


2.13. THEOREM. If A is admissible, S is a list of $\Delta_A$-predicates and
$\Sigma_A$-operations on A, then A is admissible relative to S. Moreover, any notion
that is $\Sigma$ or $\Delta$ relative to S, is already $\Sigma_A$ or $\Delta_A$, respectively.


Finally, we state an absoluteness property of $\Delta$.


2.14. PROPOSITION. Let P(x) be a $\Delta$-predicate symbol of KPU, F(x) a
$\Sigma$-operation symbol of KPU, and $A \subseteq B \subseteq V_U$ admissible sets. Then for a,
elements of A, $A \models P[a] \Leftrightarrow B \models P[a] \Leftrightarrow V_U \models P[a]$. Also, A is closed
under the operation $F^{V_U}$ (F interpreted in $V_U$) and in fact, $F^{V_U}$ restricted to A
is $F^A$, F interpreted in A.


$\Sigma$-recursion


Next we state the most important derived principle, $\Sigma$-recursion. TC(x)
means the transitive closure of the set x, i.e., the smallest transitive set b
such that $x \subseteq b$. For an urelement p, $\mathrm{TC}(p) = \emptyset$.

Many transfinite-type inductions can be construed as “inductions on 
TC(x)”, i.e., when the induction hypothesis is that the assertion is true for 
all y € TC(x). Ordinary transfinite induction falls into this category, as well 
as what is called induction on infinitary formulas. We will formulate forms 
of proof and definition by induction ‘‘on TC(x)” that are provable in KPU. 

We omit the proof of the existence of a $\Sigma$-operation symbol TC(·) in
KPU such that it is a theorem of KPU that TC(a) is transitive, $a \subseteq \mathrm{TC}(a)$,
and for any transitive b, $a \subseteq b$ implies $\mathrm{TC}(a) \subseteq b$ (cf. BARWISE [1975]). It is
quite easy to show:


2.15. PROPOSITION (TC-induction principle). For any formula $\varphi(x)$ (with
possible other free variables as well), the following is a theorem of KPU: If
for each x, $\forall y \in \mathrm{TC}(x)\, \varphi(y)$ implies $\varphi(x)$, then $\forall x\, \varphi(x)$.


2.16. PROPOSITION (Definition by $\Sigma$-recursion). Let G be an n+2-ary
$\Sigma$-function symbol, $n \geq 0$. It is possible to define a new $\Sigma$-function symbol F
so that the following is a theorem of KPU: For all $x = x_1, \ldots, x_n$ and y,
$$F(x, y) = G(x, y, (\lambda z\, F(x, z)) \restriction \mathrm{TC}(y)).$$


PROOF. The proof is exactly like the proof of the recursion principle for ZF,
cf. 5.6 in Chapter B.1. □
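
The shape of the recursion scheme in 2.16 can be imitated on hereditarily finite sets. The following is a minimal sketch under several assumptions that are not in the text: sets are Python `frozenset`s, urelements are any non-`frozenset` values (here strings), there are no parameters (the case n = 0), and the names `tc`, `define_by_recursion` and `rank_step` are hypothetical. The step function G receives y together with the restriction of F to TC(y), given as a dictionary; the example G computes a rank in which urelements and the empty set both get rank 0.

```python
def tc(x):
    """Transitive closure: the smallest transitive set containing every
    element of x. For an urelement (any non-frozenset) it is empty."""
    if not isinstance(x, frozenset):
        return frozenset()
    seen = set()
    stack = list(x)
    while stack:
        e = stack.pop()
        if e not in seen:
            seen.add(e)
            if isinstance(e, frozenset):
                stack.extend(e)
    return frozenset(seen)

def define_by_recursion(G):
    """Return F with F(y) = G(y, F restricted to TC(y))."""
    memo = {}
    def F(y):
        if y not in memo:
            restricted = {z: F(z) for z in tc(y)}  # (lambda z. F(z)) on TC(y)
            memo[y] = G(y, restricted)
        return memo[y]
    return F

def rank_step(y, restricted):
    """Step function for rank: one more than the largest rank of an element."""
    if not isinstance(y, frozenset):
        return 0
    return max((restricted[z] + 1 for z in y), default=0)

rank = define_by_recursion(rank_step)

zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
three = frozenset({zero, one, two})               # the von Neumann ordinal 3
mixed = frozenset({"p", frozenset({"p"})})        # a set built over urelement "p"
```

The recursion terminates because TC(z) is a proper subset of TC(y) whenever z is in TC(y), which is the finite counterpart of the TC-induction principle 2.15.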


2.17. PROPOSITION (Definition by $\Sigma_A$-recursion). Let A be an admissible set.
Let G be an n+2-ary $\Sigma$ (respectively, $\boldsymbol{\Sigma}$) function on A, $n \geq 0$. There is a $\Sigma$
(respectively, $\boldsymbol{\Sigma}$) function F on A such that F satisfies the identity in 2.16.


Although 2.17 is not a consequence of 2.16, the latter’s proof is easily 
modified to give 2.17. 


2.18. COROLLARY (Definition by TC-recursion of $\Delta$-predicates). Let P, Q be
$\Delta$-predicate symbols of n+1, n+2 arguments, respectively; $n \geq 0$. We can
introduce a $\Delta$-predicate symbol R so that the following are provable in KPU:


$$R(x, p) \leftrightarrow P(x, p) \quad \text{for urelements } p,$$
$$R(x, a) \leftrightarrow Q(x, a, \{b \in \mathrm{TC}(a) : R(x, b)\}).$$


There is an obvious variant of 2.18, call it Corollary 2.19, on $\Delta$-predicates
on admissible sets.

There are many familiar set-theoretical operations and predicates that
can be introduced in KPU as $\Sigma$-operations and $\Delta$-predicates, using 2.16 and
2.18. Next we will see important examples of such.


Syntax and semantics of $L_{\infty\omega}$ in KPU


For basic syntactic and semantic notions of the logic $L_{\infty\omega}$ we refer to
Chapter A.2. It is important for us that we construe formulas as sets.
Symbols in the basic language L may be urelements but we construe the
formulas $\exists x \varphi$, $\bigwedge \Phi$, etc. as the sets $(\exists, x, \varphi)$, $(\bigwedge, \Phi)$, etc., with $\exists$, $\bigwedge$
being some such fixed sets as 2 and 3. In such a set-up, proper subformulas
of a formula $\varphi$ are elements of $\mathrm{TC}(\varphi)$, which fact makes it possible to give a
definition by TC-induction of the notion "$\varphi$ is a formula of the $\infty, \omega$-logic
based on L". In fact, by 2.18 we obtain that this notion is a $\Delta$-predicate in
KPU, more precisely, that there is a $\Delta$-predicate symbol in KPU whose
interpretation on $V_U$ is that in quotes.
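
To make the "formulas as sets" set-up concrete, here is a small sketch restricted to finite formulas, since Python cannot represent an infinite conjunction. Everything in it is an illustrative assumption: tuples stand in for the set-theoretic tuples of the text, the integer codes 2, 3, 4 play the role of the fixed sets chosen for the logical symbols, atomic formulas are tagged with the string `"atom"`, and `parts` is a stand-in for TC. The predicate `is_formula` is defined by recursion on the components of its argument, in the spirit of the definition by TC-induction described above.

```python
EXISTS, AND, NOT = 2, 3, 4   # arbitrary fixed codes for the logical symbols

def parts(phi):
    """Every object occurring hereditarily inside phi (a stand-in for TC(phi))."""
    out = []
    stack = [phi]
    while stack:
        p = stack.pop()
        if isinstance(p, (tuple, frozenset)):
            for q in p:
                out.append(q)
                stack.append(q)
    return out

def is_formula(phi):
    """Decide formula-hood by recursion on the components of phi."""
    if not isinstance(phi, tuple) or not phi:
        return False
    head = phi[0]
    if head == "atom":      # ("atom", relation symbol, variables...)
        return len(phi) >= 2 and all(isinstance(v, str) for v in phi[1:])
    if head == NOT:         # (NOT, psi)
        return len(phi) == 2 and is_formula(phi[1])
    if head == EXISTS:      # (EXISTS, variable, psi)
        return len(phi) == 3 and isinstance(phi[1], str) and is_formula(phi[2])
    if head == AND:         # (AND, set of formulas)
        return (len(phi) == 2 and isinstance(phi[1], frozenset)
                and all(is_formula(p) for p in phi[1]))
    return False

inner = ("atom", "S", "x", "y")
phi = (AND, frozenset({("atom", "R", "x"), (EXISTS, "y", inner)}))
```

Each recursive call of `is_formula` is made on an object in `parts(phi)`, which mirrors the observation that proper subformulas of $\varphi$ lie in $\mathrm{TC}(\varphi)$.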

The last fact has an important consequence for admissible fragments of
$L_{\infty\omega}$. Let A be an admissible set and let $L \in A$. We denote by $L_A$ the
collection of formulas of $L_{\infty\omega}$ that are in A, $L_A = L_{\infty\omega} \cap A$, and we call $L_A$
an admissible fragment (of $L_{\infty\omega}$). By the absoluteness of $\Delta$-predicates in
KPU, 2.14, the $\Delta$-definition of $L_{\infty\omega}$ when "relativized" to A, gives a
definition of $L_A$, hence the notion of formula in $L_A$ is $\Delta$ on A. Similarly, it
turns out that the usual syntactic properties and operations on formulas of
$L_{\infty\omega}$ are $\Delta$ in KPU, hence the corresponding notions for $L_A$ are $\Delta$ on A.

Let A be admissible and, a bit more generally than before, let the
language L be a $\Delta_A$-subset of A. As before, we can again see that
the notion of formula in $L_A = L_{\infty\omega} \cap A$ and other syntactic notions of $L_A$
are $\Delta$ on A. Notice that now for any $\varphi \in L_A$ there is an A-finite $L_0 \subseteq L$
such that $\varphi \in (L_0)_A$.

Returning to the "formal-in-KPU" context, we note next that actually
the semantics of $L_{\infty\omega}$ is $\Delta$ in KPU as well. In other words, there is a
$\Delta$-predicate $\mathrm{Sat}(L, \mathfrak{M}, a, \varphi)$ of KPU expressing "a is a finite sequence of
elements of the L-structure $\mathfrak{M}$ and $\mathfrak{M} \models \varphi(a)$", i.e., Sat when interpreted in
$V_U$ coincides with the predicate in quotes. Moreover, Sat satisfies, provably
in KPU, the inductive clauses of the usual truth definition for $L_{\infty\omega}$. These
facts are simply a special case of 2.18 again. Although these facts are very
important, their importance is limited by the circumstance that in discussing
an admissible fragment $L_A$ we usually cannot restrict our attention to
models in A itself. At any rate, it follows from the above that truth of
$L_A$-formulas in an A-finite structure is a $\Delta$-predicate on A, if A is
admissible.


3. Examples of admissible sets. The truncation lemma 


Some admissible sets 

We made the remark earlier that the hereditarily finite sets, in any $V_U$,
form an admissible set. Generalizing this remark, let $\kappa$ be any infinite
cardinal and define $H_U(\kappa) = \{a \in V_U \mid \mathrm{TC}(a) \text{ has cardinality less than } \kappa\}$.
$H(\kappa)$ is $H_\emptyset(\kappa)$.


3.1. THEOREM. For all infinite cardinals x, Hu(x) is admissible. If « is 
regular, then Hu(x) is admissible relative to any predicates on it. 
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Proor The proof will follow from results below. O 


The reader will see that the case of regular κ is practically trivial. A
regular κ has the added advantage that if X ⊆ H_U(κ) and card(X) < κ, then
X ∈ H_U(κ).

For κ = ℵ₁, H_U(ℵ₁) is denoted by HC_U, H(ℵ₁) by HC, the set of
hereditarily countable sets. The syntax of L_{ω₁ω} is entirely within HC.

Let us write 𝔐 ≺₁ 𝔑 to denote that 𝔐 is an elementary substructure of 𝔑
with respect to Σ-formulas, i.e., 𝔐 ⊨ φ[a] ⟺ 𝔑 ⊨ φ[a] for any Σ-formula
φ(x) and elements a in |𝔐|. When writing A ≺₁ B for transitive sets, or
classes, A, B we of course mean the corresponding structures with "real"
∈ and "urelements". The following is a classical result of Lévy [1965].


3.2. LÉVY'S ABSOLUTENESS THEOREM. H_U(κ) ≺₁ H_U(λ) ≺₁ V_U for uncountable
cardinals κ < λ.


3.3. THEOREM. If A is a transitive set, B is an admissible set or class, and
A ≺₁ B, then A is admissible as well.


Now we see that the first assertion in 3.1 is a consequence of 3.2 and 3.3. 


3.4. THEOREM. Let κ be an infinite cardinal. For any a ∈ H_U(κ⁺), there is a
transitive A ⊆ H_U(κ⁺) such that a ∈ A and A ≺ H_U(κ⁺) (a fortiori,
A ≺₁ H_U(κ⁺) and thus A is admissible).


PROOF. We expand the structure 𝔐 = (H_U(κ⁺), ∈ ↾ H_U(κ⁺), U ∩ H_U(κ⁺))
with κ new operations (f_α : α < κ) such that for any ∅ ≠ a ∈ H_U(κ⁺),
TC(a) = {f_α(a) : α < κ}. This is clearly possible. Now, we apply the
Downward Löwenheim-Skolem Theorem (cf. Chapter A.2) to get an elementary
substructure of power κ of (𝔐, f_α)_{α<κ} containing a. The underlying set of
the new structure will satisfy the requirements. □


The last result gives us many countable (and other) admissible sets — but 
there are many left that are not obtained in this way. As we will point out 
(cf. 6.3 below) the admissible sets given by the next theorem fall into the 
latter category. 


3.5. PROPOSITION (Existence of the next admissible set). For any set a ∈ V_U,
HYP(a) =_df ⋂{A : a ∈ A, A is admissible} is itself admissible. HYP(a) =
L_α(TC(a)) for some ordinal α. Card HYP(a) = max(ℵ₀, card TC(a)).

REMARK. HYP(a) is sometimes denoted by a⁺.


PROOF. The proof (and in fact, the statement of the second assertion)
requires considering the constructible hierarchy, cf. BARWISE [1975]. □


Let now 𝔐 be a structure of finite similarity type, 𝔐 =
(|𝔐|, R₁, ..., R_n). Regarding the elements of |𝔐| as urelements, let us
place ourselves within V_𝔐. Inside V_𝔐, we form the next admissible set
HYP(𝔐) = HYP((|𝔐|, R₁, ..., R_n)), and we call it HYP_𝔐, the admissible
hull of 𝔐. Note that technically, HYP(𝔐) and HYP_𝔐 may differ; e.g. the
first may not contain urelements at all. Also, it is easy to see that for any
isomorphism f : 𝔐 ≅ 𝔑, we get an extension f* : HYP_𝔐 ≅ HYP_𝔑 which is
an {∈, U}-isomorphism.

As a consequence of a definability theorem of BARWISE [1975], Theorem
5.14, Chapter 2, we have:


3.6. THEOREM. On A = HYP_𝔐, Σ with arbitrary parameters = Σ with
parameters in |𝔐| ∪ {|𝔐|, R₁, ..., R_n}.


An ordinal α is called admissible if L_α, the set of sets constructible
before α, is admissible. In the recursion theory on admissible ordinals,
various special kinds of admissible ordinals (such as recursively inaccessible,
projectible, etc.) play important roles. An ordinal α is called stable if
L_α ≺₁ L. Obviously, every stable ordinal is admissible. The same argument
as in 3.4 gives us many countable stable ordinals. In Section 6 we will point
out (without proof) the significance of the first stable ordinal > ω for
effective descriptive set theory. For more on admissible ordinals, we refer
to BARWISE [1975].


The truncation lemma 


Let A = (|A|, E, U) be an arbitrary structure of the language {∈, U} and
assume A satisfies the (modified) axiom of extensionality:

    ∀x∀y [(¬Ux ∧ ¬Uy ∧ ∀z (zEx ↔ zEy)) → x = y]
    ∧ ∀x∀y [Ux → ¬yEx].

We call A an extensional structure. For any transitive set or class A ⊆ V_U,
(A, ∈ ↾ A, U ∩ A) is, of course, an extensional structure. Let A be an
extensional structure, and a ∈ |A|. a is said to be well-founded (w.f.) in A
if there is no infinite "descending" sequence a₀ = a, a₁, a₂, ... such that
a_{n+1} E a_n for n < ω. The whole structure is said to be w.f. if every
a ∈ |A| is. It is a good exercise to check that A is well-founded iff it
satisfies the "principle of proof by E-induction": for any subset B ⊆ |A|,
(|A|, E, U, B) satisfies ∀x (∀y (yEx → By) → Bx) → ∀x Bx. For a w.f.
extensional A, there is a corresponding principle of definition by
E-recursion.

Clearly, the "standard" ∈-structures, based on transitive sets, are
well-founded.


3.7. MOSTOWSKI COLLAPSING LEMMA. Every w.f. extensional structure is
isomorphic to a standard ∈-structure. The isomorphism is unique once its
effect on urelements is specified.


PROOF. For A = (|A|, E, U) w.f. extensional, define by E-recursion f(p) =
p for p ∈ U and f(a) = {f(b) : bEa} for a ∈ |A| − U. f is the required
isomorphism. The uniqueness is proved by E-induction. □
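
For a finite structure the E-recursion in this proof can be carried out directly. The following Python sketch is only an illustration under stated assumptions: the structure is finite, well-founded and extensional, the relation E is given as a set of pairs (b, a) meaning bEa, and urelements are sent to themselves. The function name and the encoding are hypothetical.

```python
def transitive_collapse(elements, E, urelements):
    """Compute f(p) = p for urelements and f(a) = {f(b) : b E a} otherwise.

    E is a set of pairs (b, a) meaning "b E a". The recursion terminates only
    if the structure is well-founded; this sketch does not check that.
    """
    members = {a: [b for (b, a2) in E if a2 == a] for a in elements}
    cache = {}

    def f(a):
        if a not in cache:
            if a in urelements:
                cache[a] = a  # the effect on urelements is fixed in advance
            else:
                cache[a] = frozenset(f(b) for b in members[a])
        return cache[a]

    return {a: f(a) for a in elements}
```

When the input is extensional, the resulting map is injective, and its image is a transitive set over the urelements.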


Incidentally, using 3.7 and the Löwenheim-Skolem theorem, we can
prove Lévy's theorem 3.2.

The essentially unique standard ∈-structure in 3.7 is called the transitive
collapse of A.

Now, let A be an arbitrary extensional structure. Let WF(A) ⊆ |A| (the
well-founded part of A) be the set of all w.f. elements of A. Clearly,
U^A ⊆ WF(A), and x ∈ WF(A) and yE^A x imply that y ∈ WF(A). With E^A
restricted to WF(A), WF(A) = (WF(A), E^A ↾ WF(A), U^A) is a w.f.
structure and it is extensional.
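
For a finite structure, the well-founded part can be computed as a least fixed point: an element is admitted once all of its E-members have been admitted. The sketch below assumes a finite structure with E given as a set of pairs (b, a) meaning bEa; the function name is hypothetical.

```python
def well_founded_part(elements, E):
    """Return the set of well-founded elements of a finite structure.

    E is a set of pairs (b, a) meaning "b E a". In a finite structure an
    element is non-well-founded exactly when some descending E-chain from it
    runs into a cycle; such elements are never admitted below.
    """
    members = {a: {b for (b, a2) in E if a2 == a} for a in elements}
    wf = set()
    changed = True
    while changed:
        changed = False
        for a in elements:
            # Admit a once every b with b E a is already known to be w.f.
            if a not in wf and members[a] <= wf:
                wf.add(a)
                changed = True
    return wf
```

The result is closed under E-members, in line with the remark that x ∈ WF(A) and yEx imply y ∈ WF(A).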

Next we introduce the model-theoretic notion of end-extension. Let
A, B be structures interpreting the binary predicate symbol E, among
others. B is said to be an end-extension of A if B is an extension of A in
the ordinary sense and moreover, if a ∈ |A| and bE^B a, then b ∈ |A|.


Examples. (1) For transitive sets A ⊆ B, the ∈-structure B is an
end-extension of A.

(2) For an extensional structure A, A is an end-extension of WF(A).

The following generalizes 2.6. We leave the proof as an exercise.


3.8. PROPOSITION. Let B be an end-extension of A, a elements of |A|.
(i) For any Δ₀-formula φ(x), A ⊨ φ[a] ⟺ B ⊨ φ[a].
(ii) For any Σ-formula φ(x), A ⊨ φ[a] ⟹ B ⊨ φ[a].
(iii) For any Σ-formula φ(x) and Π-formula ψ(x), if both in A and B
∀x [φ(x) ↔ ψ(x)] is true (φ defines "Δ-predicates" in A and B), then
A ⊨ φ[a] ⟺ B ⊨ φ[a].


Now we turn to studying the well-founded part of a model A =
(|A|, E, U) of KPU. The set-theoretical rank-function r(·) is a
Σ-operational symbol in KPU and "r(x) is an ordinal" and "y ∈ x → r(y) <
r(x)" are provable in KPU. It follows that a ∈ |A| is w.f. in A iff r^A(a) is.

The following is quite a characteristic property of KPU, not shared by 
many stronger set theories. 


3.9. TRUNCATION LEMMA. If A = (|A|, E, U) is a model of KPU, then
WF(A) ⊨ KPU, hence WF(A) is isomorphic to an admissible set.


PROOF. Δ₀-separation for WF(A) follows immediately from that for A. We
will apply the fact that a ∈ WF(A) iff b ∈ WF(A) for every b ∈
a_E =_df {b : bEa} several times. E.g., it follows easily from this that WF(A)
satisfies Pair and Union. It remains to prove Δ₀-collection. Let φ(x, y, z) be
a Δ₀-formula, a, c ∈ WF(A), and assume that

(i) WF(A) ⊨ ∀x ∈ a ∃y φ(x, y, c).

First of all, by 3.8 (ii) this implies that A ⊨ ∀x ∈ a ∃y φ(x, y, c) and hence

(ii) A ⊨ ∀x ∈ a ∃y ∈ b φ(x, y, c) for some b ∈ |A|.

If each "ordinal" in A is well-founded, then WF(A) = A and there is nothing
to prove. So, assume β is a non-standard (i.e., non-w.f.) "ordinal" in A. All
the standard, i.e., w.f., "ordinals" in A are smaller than β. By (i), the y's
can be chosen w.f., so we have

(iii) A ⊨ ∀x ∈ a ∃y [r(y) < β ∧ φ(x, y, c)].

By the axiom schema of foundation in KPU, there is a "smallest"
"ordinal" in A, β₀, satisfying (iii). But for every non-standard β there is
another non-standard β′ <^A β (why?), so (iii) will hold for β′ as well; so,
β₀ must be standard! Now consider b′ ∈ |A| such that xEb′ ⟺ xEb &
r(x) < β₀; b′ exists by Δ₀-separation. We have b′ ∈ WF(A) (why?). (ii) and
(iii) for β₀ imply that A, hence WF(A) as well (why?), satisfy
∀x ∈ a ∃y ∈ b′ φ(x, y, c). □


From now on, let us denote by WF(A) the transitive collapse of what 
WF(A) was before. 

Next we formulate some consequences of the truncation lemma in 
connection with next admissible sets. 


3.10. COROLLARY. Let A = HYP(a) be a pure admissible set. For any
b ∈ A, there is a sentence φ_(a,b) in the fragment L_A for some language L,
A-recursively depending on a and b, such that for any Σ-formula ψ,
A ⊨ ψ[b] ⟺ KPU + φ_(a,b) ⊨ ψ(b).


3.10 says that the truth of a Σ-formula in A can be "reduced" to the
logical consequence relationship in L_A. We will compare 3.10 to other
results later. Corollary 3.10 has a proof roughly similar to that of the next
result whose proof we will sketch.

We call a relation B ⊆ |𝔐|ⁿ Π¹₁ on 𝔐 if there is a second order formula
∀P₁ ⋯ ∀P_k ψ(x, y, P), with a finitary first order formula ψ over the
language {R, P}, such that for some parameters p in |𝔐|, we have
a ∈ B ⟺ 𝔐 ⊨ ∀P₁ ⋯ ∀P_k ψ[a, p].


3.11. COROLLARY. For an infinite structure 𝔐 of a finite similarity type,
every relation on |𝔐| that is Σ on HYP_𝔐 is Π¹₁ on 𝔐.


PROOF (outline). Notice that by the truncation lemma and the definition of
HYP_𝔐, any model 𝔑 of KPU that "contains" 𝔐 in a suitable sense will be
an end-extension of HYP_𝔐. Using also the upward persistence of
Σ-formulas, the truth of a Σ-formula in HYP_𝔐 becomes equivalent to truth
in all models of KPU that "contain" 𝔐. This indicates how the second order
universal quantifiers appear. The precise proof also uses 3.6. □


4. Hintikka sets, model existence and Σ-compactness


The basic method of constructing models is constructing them out of 
certain kinds of sets of formulas. These sets give a more or less complete 
description of not only what atomic formulas, but also what other formulas 
are true of each finite tuple of elements of the model. Perhaps Hintikka 
sets (cf. 4.1) are the most refined kind of a set of formulas among those that 
give rise to a canonical model construction, in the sense that their definition 
contains a certain minimum of requirements. 

In the rest of this chapter, by "formula" we will mean a formula of L_{ω₁ω},
for some L, unless it is clear otherwise from the context (e.g., a Σ-formula
will always mean a finitary one as before).

We say that a formula is in negation normal form (n.n.f.) if it is built up
from atomic and negated atomic formulas using only ⋀, ⋁, ∀ and ∃. It is an
easy exercise to show that every formula is logically equivalent to one in
n.n.f.
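
The exercise amounts to pushing negations inward by the De Morgan and quantifier dualities. A minimal Python sketch for finite formulas follows; the tuple encoding of formulas and the name `nnf` are assumptions of this sketch, and infinite conjunctions and disjunctions are represented here only by finite lists.

```python
# Hypothetical encoding (an assumption of this sketch):
#   ("rel", name, t1, ..., tn) and ("eq", s, t) are atomic;
#   ("not", f); ("and", [f, ...]); ("or", [f, ...]); ("all", x, f); ("ex", x, f).

ATOMIC_TAGS = ("rel", "eq")

def nnf(formula):
    """Return a formula in negation normal form equivalent to the input."""
    tag = formula[0]
    if tag in ATOMIC_TAGS:
        return formula
    if tag in ("and", "or"):
        return (tag, [nnf(g) for g in formula[1]])
    if tag in ("all", "ex"):
        return (tag, formula[1], nnf(formula[2]))
    if tag != "not":
        raise ValueError(f"unknown formula tag: {tag!r}")
    inner = formula[1]
    inner_tag = inner[0]
    if inner_tag in ATOMIC_TAGS:
        return formula                      # negated atomic formulas are allowed
    if inner_tag == "not":
        return nnf(inner[1])                # double negation
    if inner_tag == "and":
        return ("or", [nnf(("not", g)) for g in inner[1]])
    if inner_tag == "or":
        return ("and", [nnf(("not", g)) for g in inner[1]])
    if inner_tag == "all":
        return ("ex", inner[1], nnf(("not", inner[2])))
    if inner_tag == "ex":
        return ("all", inner[1], nnf(("not", inner[2])))
    raise ValueError(f"unknown formula tag: {inner_tag!r}")
```

For instance, the negation of ∀x (P(x) ∨ ¬Q(x)) is rewritten to ∃x (¬P(x) ∧ Q(x)).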

4.1. DEFINITION. Let H be a set of sentences in n.n.f., with at least one
individual constant occurring in H. H is called a Hintikka set if the
following conditions are all satisfied:

(i) For any atomic formula φ, it is impossible that both φ and ¬φ belong
to H.

(iia) For any closed (variable free) term t (of the language of H),
t ≈ t ∈ H.

(iib) For any atomic formula φ(x) with at most x free, any closed terms
t₁, t₂, if φ(t₁) ∈ H and t₁ ≈ t₂ ∈ H, then φ(t₂) ∈ H.

(iii) If ⋀Σ ∈ H, then φ ∈ H for every φ ∈ Σ.

(iv) If ∀x φ(x) ∈ H, then φ(t) ∈ H for every closed term t.

(v) If ⋁Σ ∈ H, then φ ∈ H for some φ ∈ Σ.

(vi) If ∃x φ(x) ∈ H, then φ(t) ∈ H for some closed term t.


4.2. PROPOSITION. For any Hintikka set H, there is a model 𝔐 of H such
that every element of |𝔐| is the denotation of some closed term.


PROOF. Let L be the set of nonlogical symbols in H. Introduce the
equivalence relation ~ on closed terms by t₁ ~ t₂ ⟺_df t₁ ≈ t₂ ∈ H. By (ii), ~
is in fact a congruence relation with respect to symbols in L, i.e., we can
define an L-structure 𝔐 on the set |𝔐| of equivalence classes of ~ by
putting

    (t₁/~, ..., t_n/~) ∈ P^𝔐 ⟺_df P t₁ ⋯ t_n ∈ H,
    f^𝔐(t₁/~, ..., t_n/~) =_df (f t₁ ⋯ t_n)/~.

(Since there is at least one individual constant, |𝔐| is non-empty.) By
induction on the n.n.f. sentence φ, we now show that φ ∈ H implies 𝔐 ⊨ φ.
For negated atomic formula φ, we use condition (i). The easy induction
steps using the rest of the conditions are omitted. □
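
To see the quotient construction at work, here is a small Python sketch restricted to a language with individual constants and relation symbols only (no operation symbols), and to the atomic sentences of H. The encoding of sentences and the name `canonical_model` are assumptions of this sketch. It presupposes that H already satisfies conditions (iia) and (iib), so that the equality sentences in H describe an equivalence relation; the union-find bookkeeping merely computes its classes.

```python
def canonical_model(constants, H):
    """Build the term model of the atomic part of a Hintikka set H.

    H contains tuples ("eq", c, d) and ("rel", name, c1, ..., cn) over the
    given constants. Returns (universe, relations, find), where find(c) is
    the representative of the equivalence class of the constant c.
    """
    parent = {c: c for c in constants}

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]   # path halving
            c = parent[c]
        return c

    # Identify constants that H declares equal.
    for sentence in H:
        if sentence[0] == "eq":
            parent[find(sentence[1])] = find(sentence[2])

    universe = {find(c) for c in constants}

    # P holds of a tuple of classes iff some atomic sentence in H says so.
    relations = {}
    for sentence in H:
        if sentence[0] == "rel":
            _, name, *args = sentence
            relations.setdefault(name, set()).add(tuple(find(a) for a in args))
    return universe, relations, find
```

Every element of the resulting universe is the denotation of a constant, as required in 4.2.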


The "crude" version of the notion of Hintikka set involved in the next
result will be useful too. We call L_B a fragment of L_{ω₁ω} if it is a set of
L_{ω₁ω}-formulas such that

(i) with any φ ∈ L_B, any subformula of φ belongs to L_B (L_B is closed
under subformulas),

(ii) L_B is closed under substitution of terms of L for free variables,

(iii) L_B is closed under finitary logical operations ¬, ∧, ∨, →, ∀ and ∃,

(iv) whenever ⋁Σ ∈ L_B, then ⋁{⋁Σ → φ : φ ∈ Σ} belongs to L_B,

(v) each atomic formula of L belongs to L_B.

Notice that each countable admissible fragment is trivially a fragment in
this sense.


4.3. PROPOSITION. Let L_B be a fragment of L_{ω₁ω}, C ⊆ L a set of individual
constants and T a set of sentences in L_B. Assume:

(i) Every finite subset of T is satisfiable.

(ii) If the sentence ⋁Σ is a valid disjunction and it belongs to L_B, then
there is φ ∈ Σ such that φ ∈ T.

(iii) If ∃x φ(x) ∈ L_B and it is a valid sentence, then φ(c) ∈ T for some
c ∈ C.

Then T is complete in L_B, i.e., for any sentence φ in L_B, either φ ∈ T or
¬φ ∈ T, and T has a model 𝔐 such that every element of |𝔐| is the
denotation of some c ∈ C. 𝔐 is unique up to isomorphism.


PROOF. The proof is similar to that of 4.2. □


Let L_A be an admissible fragment of L_{ω₁ω}, with a countable admissible
set A. We now will show an application of 4.3 in the proof of Barwise's
Compactness Theorem, one of the main results of the whole subject. In the
proof, we will use Barwise's Completeness Theorem, to be proved in the
next section. Implicitly, the proof of completeness will involve the more
refined notion of Hintikka set.


4.4. Σ-COMPACTNESS THEOREM (Barwise). Let A be a countable admissible
set. Let T be a set of sentences in L_A, and let T be Σ on A. If every A-finite
subset of T is satisfiable, then T is satisfiable.


PROOF. We will extend T to a set satisfying (i)-(iii) in 4.3. Let C be a
countably infinite set of constants, not occurring in T. Let ∃x ψ_n(x), ⋁Σ_n
(n = 0, 1, 2, ...) be enumerations of all logically valid sentences of the
respective kinds of the fragment L_A(C), the set of sentences that are
obtained by substituting elements of C for free variables in formulas in L_A
(note that each φ in L_A(C) contains only finitely many constants from C).
By induction on n < ω, we will define the sets T_n such that (i) T_n − T is a
finite set of sentences of L_A(C), and (ii) each A-finite subset of T_n is
satisfiable. Put T₀ = T. Then by the hypothesis, (i) and (ii) are satisfied.
Suppose T_n is defined. Pick any c ∈ C such that c does not occur in T_n (by
(i), only finitely many elements of C occur in T_n) and put T′ =
T_n ∪ {ψ_n(c)}. Notice that by (i), T′ is a Σ-set on A. Since ∃x ψ_n(x) is valid,
by (ii) for T_n it is easy to see that each A-finite subset of T′ is satisfiable.

Now, let ⋁Σ = ⋁Σ_n, and assume, for "reductio ad absurdum", that for
every φ ∈ Σ, T′ ∪ {φ} fails to be A-finitely satisfiable. This means that we
have

    A ⊨ ∀φ ∈ Σ ∃a ("a ⊆ T′ & ¬(⋀a ∧ φ) is logically valid").

The Barwise Completeness Theorem in its "abstract" version (to be
proved in the next section) says that the predicate "φ is a logically valid
sentence in A" is a Σ-predicate on A. Using completeness and the
Σ-formula defining T′ with a parameter b, we can find a Σ₁-formula
∃x δ(φ, x, a, b), with δ being Δ₀, expressing the part in quotes. By
A ⊨ ∀φ ∈ Σ ∃a ∃x δ(φ, x, a, b) and Δ₀-collection we obtain that there is
c ∈ A such that

    A ⊨ ∀φ ∈ Σ ∃a ∈ c ∃x ∈ c δ(φ, x, a, b).

Consider

    d =_df {a ∈ c : ∃φ ∈ Σ ∃x ∈ c δ(φ, x, a, b)}.

By Δ₀-separation, d ∈ A. Let a₀ = ⋃d; we have that a₀ ∈ A. Since for
every a ∈ d, A ⊨ ∃x δ(φ, x, a, b) for some φ, we have a ⊆ T′ and thus a₀ is
an A-finite subset of T′. By the definition of a₀, for every φ ∈ Σ, there is
a ⊆ a₀ such that a ∪ {φ} is unsatisfiable, hence, a fortiori for every φ ∈ Σ,
a₀ ∪ {φ} is unsatisfiable. But this means precisely that a₀ ∪ {⋁Σ} is
unsatisfiable, contradicting the fact that T′ is A-finitely satisfiable.

We have shown that there is φ ∈ Σ such that T′ ∪ {φ} is A-finitely
satisfiable. Define T_{n+1} = T′ ∪ {φ}. This completes the definition of the
sequence T_n, n < ω.

Put T_ω = ⋃_{n<ω} T_n. By inspecting the construction, we see that T_ω
satisfies the conditions of 4.3; (i) because every finite subset of T_ω is a
finite, hence an A-finite, subset of some T_n. By 4.3, T_ω and T have a
model. □


5. Conjunctive game formulas 


Some second order notions 


A Σ¹₁-formula over L_{ω₁ω} (or, over L_A) is a second order formula of the
form ∃R φ(R) where R is a finite or infinite set of predicate and/or
operation symbols, φ is a formula of (L ∪ R)_{ω₁ω} (or, of (L ∪ R)_A and in
this case R is A-finite as well). If R is {R₁, ..., R_n}, we also write
∃R₁ ⋯ ∃R_n φ. If the free variables of ∃R φ are x, ∃R φ = ∃R φ(R, x),
then 𝔐 ⊨ ∃R φ(R, a) for a ∈ |𝔐| if (naturally) there are relations R^𝔐 on
|𝔐| corresponding to R ∈ R such that (𝔐, R^𝔐)_{R∈R} ⊨ φ[R, a]. Π¹₁-formulas
are similar but have ∀ in place of the ∃. A Π¹₁- or Σ¹₁-formula without
further qualification is one such over finitary logic, L_{ωω}.

For a class K of similar structures, we will say that K is L_{ω₁ω}-elementary,
or L_A-elementary, or Σ¹₁-over-L_{ω₁ω}, etc., if there is φ such that 𝔐 ∈ K iff
𝔐 ⊨ φ and φ ∈ L_{ω₁ω}, or φ ∈ L_A, or φ is Σ¹₁-over-L_{ω₁ω}, etc., resp. We say
that K is an L_{ω₁ω}-elementary class of countable structures if for some
φ ∈ L_{ω₁ω}, 𝔐 ∈ K iff 𝔐 ⊨ φ and 𝔐 is countable. Similarly, for the other
notions.

We say that K is Δ¹₁ over L_{ω₁ω}, etc., if K is both Σ¹₁ and Π¹₁ over L_{ω₁ω},
etc. In short, Δ¹₁ = Σ¹₁ ∩ Π¹₁.

Next, fix a structure 𝔐 of similarity type L, let Φ(x, y) be a formula (of
an unspecified logic), let p be parameters in |𝔐| (corresponding to y) and
let B ⊆ |𝔐|ⁿ be the relation defined by Φ with the parameter p: for
a ∈ |𝔐|ⁿ, a ∈ B ⟺ 𝔐 ⊨ Φ[a, p]. If Φ is finitary first order, B is called
elementary on 𝔐. If Φ is Π¹₁, or Π¹₁ over L_A, etc., we call B Π¹₁, or Π¹₁ over
L_A, etc. as well. If Φ is in L_{ω₁ω} or in L_A, B is called L_{ω₁ω}-elementary, or
L_A-elementary, resp. Taking 𝔐 = ω =_df (ω, 0, 1, +, ·), and a finitary first
order formula Φ, we obtain what is ordinarily called an arithmetical subset
of ω. With Φ Π¹₁ or Σ¹₁, we obtain the Π¹₁- or Σ¹₁-subsets of ω. For a general
𝔐, again Δ¹₁ =_df Σ¹₁ ∩ Π¹₁, in all possible meanings of Σ¹₁ and Π¹₁.

As a third group of notions, we introduce second order generalizations
of the notions in the second group. Let 𝔐 be a fixed structure of similarity
type L again, let Ψ(S, y) be a formula (of an unspecified kind) containing
an n-ary predicate symbol S ("predicate variable") not interpreted in 𝔐
(but all other non-logical symbols in Ψ are interpreted in 𝔐), and let p be
some fixed parameters in |𝔐|. Let X be the set of n-ary relations S on
|𝔐| such that (𝔐, S) ⊨ Ψ(S, p). If Ψ is finitary first order, X is called
elementary on 𝔐. Again, we obtain corresponding second order notions as
well. If Ψ is Π¹₁ over (L ∪ {S})_{ω₁ω}, X is called Π¹₁ over L_{ω₁ω} (!) on 𝔐, etc.
Instead of Π¹₁ over L_{ω₁ω}, we also write 𝚷¹₁ (boldface Π¹₁), etc.

Let E be the topological space of n-ary relations on ω, 2^(ωⁿ) with the
ordinary product topology. E is homeomorphic to the Cantor discontinuum.
Then for X ⊆ E, X is Borel, analytic or complement analytic in the
classical sense (cf. Chapter C.8) iff X is L_{ω₁ω}-elementary, 𝚺¹₁, or 𝚷¹₁,
respectively, on ω.

Classes of structures are "more general" than classes of sets, in the
following sense. Let X be a class of n-ary relations on |𝔐|, p finitely many
elements in |𝔐| and consider the class X* of all structures that are
isomorphic to (𝔐, p, S) for some S ∈ X. Results on X can sometimes be
derived from those on X*, for the following reason. For simplicity, let
𝔐 = ω. The following L_{ω₁ω}-sentence φ_ω characterizes ω up to isomorphism:

    ⋀ Diagram_ω ∧ ∀x ⋁{x ≈ ⌜n⌝ : n < ω};

here ⌜n⌝ = 1 + ⋯ + 1 is the formal term denoting n. Now, if X is defined
by Ψ(S, p), then X* is defined by φ_ω ∧ Ψ(S, p); so, if X is 𝚷¹₁, 𝚺¹₁ or 𝚫¹₁,
so is X*, etc. We will exploit this possibility several times.


Conjunctive game sentences 


A conjunctive game formula, or a Vaught formula as we will call it
sometimes, is a particular kind of Σ¹₁-over-L_{ω₁ω}-formula, but in a peculiar
form. A Vaught formula has the form

\[
\Phi(z) = \forall u_1 \bigwedge_{k_1\in K_1} \exists v_1 \bigvee_{l_1\in L_1} \cdots
\forall u_n \bigwedge_{k_n\in K_n} \exists v_n \bigvee_{l_n\in L_n} \cdots\
\bigwedge_{n<\omega} \varphi^{k_1 l_1\cdots k_n l_n}(z, u_1, v_1, \ldots, u_n, v_n)
\]

where K₁, L₁, ... are countable sets and φ^{k₁l₁⋯k_nl_n} is an L_{ω₁ω}-formula
with the indicated free variables, for every k₁ ∈ K₁, l₁ ∈ L₁, ... . The
meaning of "Φ is satisfied by c in 𝔐", in notation 𝔐 ⊨ Φ[c for z], is best
explained by an infinite game played by two players ∀ and ∃ on the
structure 𝔐. A play consists of an ω-sequence of moves. At move n (≥ 1),
∀ picks an element a_n ∈ |𝔐| (interpreting u_n) and an index k_n ∈ K_n and
∃ replies by picking b_n ∈ |𝔐| (interpreting v_n) and l_n ∈ L_n. After move n
is completed for all n = 1, 2, ..., ∃ wins iff
𝔐 ⊨ ⋀_{n<ω} φ^{k₁l₁⋯k_nl_n}[c, a₁, b₁, ..., a_n, b_n]. A winning strategy for
∃ in this game is a sequence (f_n : n < ω) of functions,
f_n : |𝔐| × K₁ × ⋯ × |𝔐| × K_n → |𝔐| × L_n, such that for any play
(a₁, k₁, a₂, k₂, ...) of ∀, if ∃ always replies according to his strategy,
i.e., by choosing b_n, l_n determined by (b_n, l_n) = f_n(a₁, k₁, ..., a_n, k_n),
then ∃ wins in the above sense. Finally, 𝔐 ⊨ Φ[c] means that ∃ has a
winning strategy in the game on (𝔐, c), associated with Φ.

Let us first make a series of easy remarks on conjunctive game sentences.
First of all, one might want to consider a less regular prefix than the one
exhibited, or some of the quantifiers ∀, ∃ and ⋀, ⋁ (which can be
considered quantifiers on the fixed sets K_n, L_n) might be entirely missing.
In all of these cases, by inserting dummy variables one can easily bring the
formula to the exact form exhibited.

This remark is valid even if one takes a finite prefix to begin with, and
e.g., one considers a prenex formula. Then one realizes that the
truth-definition via the game interpretation is nothing but the familiar
Skolem form, with Skolem functions for the existentially quantified
variables. This remark easily leads to recognizing the Σ¹₁ character of
conjunctive game sentences in general, but we will return to this point below.
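
The finite-prefix case can be checked mechanically on a finite structure. The following Python sketch is an illustration only: it evaluates the game for a finite quantifier prefix by exhaustive search and, for the prefix ∀x∃y, searches separately for a Skolem function, so that the two can be compared. The function names and the representation of the matrix as a Python predicate on assignments are assumptions of this sketch.

```python
from itertools import product

def exists_player_wins(prefix, matrix, universe, env=None):
    """Decide whether the player ∃ wins the finite game for a prenex formula.

    prefix is a list of pairs ("all", var) or ("ex", var); matrix is a
    predicate on assignments (dicts from variable names to elements).
    """
    env = dict(env or {})
    if not prefix:
        return bool(matrix(env))
    (quantifier, var), rest = prefix[0], prefix[1:]
    outcomes = (exists_player_wins(rest, matrix, universe, {**env, var: a})
                for a in universe)
    # ∃ must survive every move of ∀, and needs only one good move of her own.
    return all(outcomes) if quantifier == "all" else any(outcomes)

def has_skolem_function(matrix, universe):
    """For the prefix ∀x ∃y only: is there f with matrix(x, f(x)) for all x?"""
    universe = list(universe)
    for values in product(universe, repeat=len(universe)):
        f = dict(zip(universe, values))
        if all(matrix({"x": a, "y": f[a]}) for a in universe):
            return True
    return False
```

On a three-element universe, the matrix "y is the cyclic successor of x" is won by ∃ and has a Skolem function, while the matrix "x < y" fails on both counts.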

For the next remark, let us start with the Vaught formula Φ exhibited
above, and for simplicity, let us make it a sentence, i.e., z the empty
sequence. Then for any fixed indices k₁, ..., l_n from K₁, ..., L_n, resp., we
consider the truncated game formula Φ^{k₁⋯l_n}(u₁, ..., v_n) obtained by
deleting the quantifiers from the beginning up to and including ∃v_n. Let p
be the deleted part of the prefix. Now, pΦ^{k₁⋯l_n}(u₁, ..., v_n) can be
interpreted in two ways; one is the original Φ and the other is "p applied to
(Φ^{k₁⋯l_n} : k₁ ∈ K₁, ..., l_n ∈ L_n)." The reader should convince himself
that the two meanings are actually the same.

As a next remark, we state: 


5.1. PROPOSITION. For the Vaught sentence Φ as above (for simplicity,
z = ∅), 𝔐 ⊨ Φ iff there are predicates P^{k₁⋯l_n}(u₁, ..., v_n) (n < ω,
k₁ ∈ K₁, ..., l_n ∈ L_n) on |𝔐| such that the structure 𝔐′ obtained by
expanding 𝔐 by these predicates satisfies the sentence

\[
\begin{aligned}
& P^{\emptyset} \;\wedge\; \bigwedge_{1\le n<\omega}\ \bigwedge_{k_1\in K_1}\cdots\bigwedge_{l_{n-1}\in L_{n-1}}
\forall u_1\cdots\forall v_{n-1}\Big[ P^{k_1\cdots l_{n-1}}(u_1,\ldots,v_{n-1}) \\
& \qquad\qquad \to \forall u_n \bigwedge_{k_n\in K_n} \exists v_n \bigvee_{l_n\in L_n}
P^{k_1\cdots l_n}(u_1,\ldots,u_n,v_n) \Big] \\
& \wedge\; \bigwedge_{1\le n<\omega}\ \bigwedge_{k_1\in K_1}\cdots\bigwedge_{l_n\in L_n}
\forall u_1\cdots\forall v_n \Big( P^{k_1\cdots l_n}(u_1,\ldots,v_n) \to \varphi^{k_1\cdots l_n}(u_1,\ldots,v_n) \Big).
\end{aligned}
\]

PROOF. For the only if part, notice that one can take P^{k₁⋯l_n} to be
Φ^{k₁⋯l_n} as defined above. Conversely, supposing the P^{k₁⋯l_n} exist with
the required properties, one defines by induction on n the functions

    f_n : |𝔐| × K₁ × ⋯ × |𝔐| × K_n → |𝔐| × L_n

such that P^{k₁⋯l_n}(a₁, b₁, ..., a_n, b_n) holds for any a₁, ..., a_n ∈ |𝔐|,
k₁ ∈ K₁, ..., k_n ∈ K_n and for (b_i, l_i) = f_i(a₁, k₁, ..., a_i, k_i). For this,
one uses the first and second conjuncts of the sentence exhibited above.
Using the last conjunct, it immediately follows that the f_n form a winning
strategy for ∃. □

We say that the Vaught formula Φ is A-finite (or is in A) if the family
(φ^{k₁⋯l_n} : n < ω, k₁ ∈ K₁, ..., l_n ∈ L_n) is A-finite.


5.2. COROLLARY. Any Vaught formula is logically equivalent to a
Σ¹₁-formula over L_{ω₁ω}. If A is an admissible set, and the Vaught formula Φ is
A-finite (in particular, ω ∈ A), then Φ is logically equivalent to a
Σ¹₁-over-L_A-formula.


The first main result tells us that, as far as countable models are
concerned, conjunctive game formulas have the same expressive power as
Σ¹₁-over-L_{ω₁ω}-formulas. What makes us consider the more complicated
notion of conjunctive game formulas is the fact that their special syntactic
form lends itself to useful manipulations, in particular, the construction of
the approximations, cf. below.

We will use the notation ⊨′ φ to indicate that φ is true in all countable
models. By the Downward Löwenheim-Skolem Theorem for L_{ω₁ω} (cf.
Chapter A.2), for an L_{ω₁ω}-sentence φ, ⊨′ φ is equivalent to ⊨ φ.


5.3. GAME-FORM THEOREM. For every Σ¹₁-sentence ∃R φ(R) over L_{ω₁ω} there
is a Vaught sentence Φ involving only symbols from L such that
⊨′ Φ ↔ ∃R φ(R). If A is an admissible set, ω ∈ A and ∃R φ(R) is Σ¹₁ over
L_A, then Φ can be chosen to be A-finite. Actually, there is a Σ-operation on A
giving a suitable Φ for any Σ¹₁-over-L_A-sentence as argument.


PROOF (HARNIK [1974]). We may assume that φ(R) is in n.n.f. Let Δ be a
countable fragment (cf. above) of L(R)_{ω₁ω} containing φ(R) and x ≈ y (for
the purpose of the "A-finite" case of the theorem, we have to take Δ to be
A-finite; the smallest fragment containing φ and x ≈ y will be suitable
(exercise; this uses the fact that L(R) is (can be taken to be) A-finite and
that ω ∈ A)). Let C be an infinite set of new individual constants and Δ(C)
the set of sentences that are substitution instances of formulas in Δ with
elements of C substituted for the free variables. Let D be the set of all
closed terms of the language L ∪ R ∪ C. Now, we claim that for any
countable L-structure 𝔐, 𝔐 ⊨ ∃R φ(R) iff the following is satisfied.

(i) There is a Hintikka set (cf. 4.1) Θ ⊆ Δ(C) and an onto function
f : D → |𝔐| such that φ(R) ∈ Θ and for all atomic formulas π(v₀, ..., v_n)
of L, and all c₀, ..., c_n ∈ D, 𝔐 ⊨ π[f(c₀), ..., f(c_n)] iff π(c₀, ..., c_n) ∈ Θ,
and either π(c₀, ..., c_n) or ¬π(c₀, ..., c_n) belongs to Θ.

In fact, if 𝔐 ⊨ ∃R φ(R), i.e., (𝔐, R) ⊨ φ for certain relations R on |𝔐|,
then let f be any onto function C → |𝔐| and define Θ to be the set of all
ψ(c₀, ..., c_n) ∈ Δ(C) in n.n.f. such that (𝔐, R) ⊨ ψ[f(c₀), ..., f(c_n)]. It is
easy to see that these choices will satisfy (i). Conversely, assume (i). Let 𝔐′
be the canonical model of Θ, constructed in the proof of 4.2. Then
𝔐′ ⊨ φ(R). By the definition of 𝔐′ and by (i), we see that the function
c/~ ↦ f(c) is an isomorphism of 𝔐′ ↾ L and 𝔐, establishing 𝔐 ⊨ ∃R φ(R).

Next we express condition (i) on 𝔐 in the form of a Vaught sentence Φ.
Φ is defined as

\[
\Phi = \forall u_1 \bigvee_{c_1\in D} \bigwedge_{d_1\in C} \exists v_1
\bigwedge_{\delta_1\in\Delta(C)} \bigvee_{\theta_1\in\Delta(C)} \cdots
\Big( \bigwedge_{n<\omega} N^{c_1 d_1 \delta_1 \theta_1 \cdots c_n d_n \delta_n \theta_n}(u_1, v_1, \ldots, u_n, v_n) \Big)
\]

(in the prefix, blocks like ∀u_n ⋯ ⋁_{θ_n∈Δ(C)} appear for each n = 1, 2, ...)
and the formulas N^{c₁⋯θ_n} are defined as follows. There are two cases.
Either every one of the conditions (ii)-(v) below on (c₁, ..., θ_n) is satisfied
(case 1) or not (case 2). In case 2, we put N^{c₁⋯θ_n} to be identically false,
⋁∅. In case 1, we put

    N^{c₁⋯θ_n}(u₁, ..., v_n) =
      ⋀{π(u₁, ..., v_n) : π is an atomic or negated atomic formula
         of L and π(c₁, d₁, ..., c_n, d_n) ∈ {θ₁, ..., θ_n}}.

The conditions are as follows; we put Θ_n = {φ, θ₁, ..., θ_{n-1}}.

(ii) For no atomic π, both π and ¬π belong to Θ_n.

(iii) If δ_n ∈ Θ_n and δ_n = ⋁Ψ or δ_n = ∃v ψ(v), then θ_n = ψ for some
ψ ∈ Ψ, or θ_n = ψ(t) for some closed term t, resp.

(iv) If δ_n ∉ Θ_n but either (a) δ_n is t ≈ t or (b) δ_n is φ(t₂) such that
t₁ ≈ t₂ ∈ Θ_n and φ(t₁) ∈ Θ_n, or (c) δ_n ∈ Σ for some Σ such that ⋀Σ ∈ Θ_n, or
(d) δ_n = φ(t) such that ∀x φ(x) ∈ Θ_n, then θ_n = δ_n.

(v) If δ_n ∉ Θ_n, δ_n is an atomic formula of L and none of (iv) (a), (b), (c), (d)
is the case, then θ_n = δ_n or θ_n = ¬δ_n.

(Intuitively, Θ_n is an "initial segment" of a Hintikka set, δ_n is the next
formula to handle and θ_n is the next formula to be put into the Hintikka
set.)

Let us show that 𝔐 satisfies (i) iff 𝔐 ⊨ Φ, for a countable 𝔐. It is easy to
see that (i) implies 𝔐 ⊨ Φ. Indeed, let Θ be a Hintikka set satisfying (i).
This is how we define a strategy of ∃. c_n ∈ D and b_n ∈ |𝔐| (interpreting
v_n) are to be chosen by ∃ so that f(c_n) = a_n = choice of ∀ interpreting u_n,
and f(d_n) = b_n where d_n ∈ C is the choice of ∀ in move n. Finally θ_n is to be
chosen by ∃ always in Θ and satisfying the requirements (iii)-(v); that this
is possible follows from the fact that Θ is a Hintikka set and
{φ(R), θ₁, ..., θ_{n-1}} (chosen so far) is a subset of Θ. Again from the
Hintikka character of Θ, we will have that (ii) is satisfied. By the rest of (i),
we have that 𝔐 ⊨ N^{c₁⋯θ_n}[a₁, ..., b_n], for any choices of ∀ and strategic
choices of ∃, proving that the exhibited strategy is winning.

Suppose next that 𝔐 ⊨ Φ and let us fix a winning strategy of ∃ in the
corresponding game played on 𝔐. Let the strategy of ∃ be pitted against a
play of ∀ in which

(vi) ∀ exhausts |𝔐| by choosing a_n (interpreting u_n) so that |𝔐| =
{a_n : 1 ≤ n < ω},

(vii) every element of C occurs as some d_n (chosen by ∀ at move n) and

(viii) every sentence in Δ(C) occurs infinitely often as some δ_n.

Then the resulting play will be a sequence

    a₁, c₁, d₁, b₁, δ₁, θ₁, ...
    ∀   ∃   ∀   ∃   ∀   ∃

(with the players indicated) and it will give us Θ = {θ_n : 1 ≤ n < ω} ∪ {φ}
that is a Hintikka set containing φ. Indeed, since we have

(ix) 𝔐 ⊨ N^{c₁⋯θ_n}[a₁, ..., b_n] for each n,

N^{c₁⋯θ_n} cannot be identically false. Hence case 2 above does not occur
(for the sequences (c₁, ..., θ_n) given by the play), hence all the conditions
(ii)-(v) hold. By (ii), 4.1(i) will hold. By (iv) and (viii), 4.1(ii), (iii) and
(iv) will hold: e.g., if ∀x φ(x) ∈ Θ and t is a closed term, then for some n,
∀x φ(x) ∈ Θ_n and φ(t) = δ_n (by (viii)), hence by (iv), θ_n = φ(t) ∈ Θ. By
(iii) and (viii), we similarly conclude that 4.1(v) and (vi) hold for Θ, hence
Θ is a Hintikka set indeed.

Furthermore, the set

    f = {(c_n, a_n) : 1 ≤ n < ω} ∪ {(d_n, b_n) : 1 ≤ n < ω} ⊆ D × |𝔐|

has domain and range D and |𝔐|, respectively, by (vii) and (vi). By (ix),
the definition of the N^{c₁⋯θ_n}, (v) and (viii) we obtain that

    𝔐 ⊨ π[e₁, ..., e_n] iff π(f₁, ..., f_n) is in Θ,

whenever (f_i, e_i) ∈ f, π(x₁, ..., x_n) is an atomic formula of L. This implies
that f is a function and also that the last part of (i) holds. This completes
showing that 𝔐 ⊨ Φ implies (i).

We completed showing that ⊨′ ∃R φ(R) ↔ Φ. From the above, one can
extract a proof that ∃R φ(R) implies Φ even on an uncountable structure
𝔐 (exercise), but the converse direction does need the countability of 𝔐.

It is routine to verify that Φ will have the additional properties as
well. □


Approximations of conjunctive game formulas 


The usefulness of Vaught formulas is mainly based on their
approximations invented by Vaught (but also consult the historical remark at
the end of this section). Let Φ be the general Vaught sentence exhibited
above; again, we take z to be the empty sequence, for the sake of simplicity.
We will use the abbreviation k for (k₁, ..., l_n) (hence, for n = 0, k is the
empty sequence), and k|i for its initial segment (k₁, ..., l_i). For arbitrary
0 ≤ n < ω, arbitrary k ∈ K₁ × ⋯ × L_n and an arbitrary ordinal α, we define
the L_{∞ω}-formula Φ^k_α = Φ^k_α(u₁, ..., v_n) by a simultaneous induction on α
as follows:

\[
\begin{aligned}
\Phi^{k}_{0} &= \bigwedge_{i\le n} \varphi^{k|i}, \\
\Phi^{k}_{\alpha+1} &= \forall u_{n+1} \bigwedge_{k_{n+1}\in K_{n+1}} \exists v_{n+1}
\bigvee_{l_{n+1}\in L_{n+1}} \Phi^{k\,k_{n+1} l_{n+1}}_{\alpha}, \\
\Phi^{k}_{\lambda} &= \bigwedge_{\alpha<\lambda} \Phi^{k}_{\alpha} \quad \text{for a limit } \lambda.
\end{aligned}
\]

Clearly, for countable α, Φ^k_α is a formula of L_{ω₁ω}. Moreover, Φ^k_α as a
function of Φ, k and α is a Σ-operation on any admissible set (by
Σ-recursion). In particular, if A is admissible, if Φ is A-finite, so are all
approximations Φ^k_α for α < Ord(A). We write Φ_α for Φ^∅_α.

If the prefix of the Vaught formula is different from the standard form, 
the approximations are appropriately modified. E.g., if ® is 


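\[ \forall x_1 \exists y_1 \forall x_2 \exists y_2 \cdots \bigwedge_{n} \Phi^n(x_1, \dots, y_n), \]

then the approximations are

\[ \Phi^n_0 = \bigwedge_{i \le n} \Phi^i(x_1, \dots, y_i), \]

\[ \Phi^n_{\alpha+1} = \forall x_{n+1} \exists y_{n+1}\, \Phi^{n+1}_\alpha, \]

\[ \Phi^n_\lambda = \bigwedge_{\alpha < \lambda} \Phi^n_\alpha, \quad \lambda \text{ limit}. \]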
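As a worked instance of the modified definition (a sketch only; it assumes the base clause is read as $\Phi^n_0 = \bigwedge_{i \le n} \Phi^i$), the approximation of index 2 unwinds as follows:

```latex
\begin{align*}
\Phi^0_2 &= \forall x_1 \exists y_1\, \Phi^1_1 \\
         &= \forall x_1 \exists y_1 \forall x_2 \exists y_2\, \Phi^2_0 \\
         &= \forall x_1 \exists y_1 \forall x_2 \exists y_2
            \bigwedge_{i \le 2} \Phi^i(x_1, \dots, y_i).
\end{align*}
```

A finite index $m$ thus truncates the game after $m$ rounds, and $\Phi^0_\omega$ is the conjunction of all these finite truncations.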


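5.4. THEOREM (VAUGHT [1973]). With the above notation,
(i) For $\alpha > \beta$, $\models \Phi^{\mathbf{k}}_\alpha \to \Phi^{\mathbf{k}}_\beta$.
(ii) For every $\alpha$, $\models \Phi \to \Phi_\alpha$.
(iii) For every structure $\mathfrak{M}$ of power $\le \kappa$, $\mathfrak{M} \models \Phi_{\kappa^+} \to \Phi$.
(iv) In fact, if $A$ is admissible, and both $\Phi$ and the $L$-structure $\mathfrak{M}$ are
$A$-finite, then $\mathfrak{M} \models \Phi_{\mathrm{Ord}(A)} \to \Phi$.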


PROOF. (i) is left as an exercise.

(ad (ii)) Recall the "truncated" game-formulas $\Phi^{\mathbf{k}}(u_1, \dots, v_n)$. (ii) is
proved as a special case of $\models \forall u_1 \cdots v_n\,[\Phi^{\mathbf{k}} \to \Phi^{\mathbf{k}}_\alpha]$, which in turn is proved
by a straightforward induction on $\alpha$.

(ad (iii)) (iii) follows from (iv) (why?) but (iii) is easier to show than (iv). 

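(ad (iv)) We first prove the following fact:

\[ \mathfrak{M} \models \forall u_1 \cdots v_n \Bigl[ \bigwedge_{\alpha < \mathrm{Ord}(A)} \Phi^{\mathbf{k}}_\alpha \to \forall u_{n+1} \bigwedge_{k_{n+1} \in K_{n+1}} \exists v_{n+1} \bigvee_{l_{n+1} \in L_{n+1}} \bigwedge_{\alpha < \mathrm{Ord}(A)} \Phi^{\mathbf{k}\, k_{n+1} l_{n+1}}_\alpha \Bigr]; \]

here $k_1 \in K_1, \dots, l_n \in L_n$ and $\mathbf{k} = (k_1, \dots, l_n)$. Let $a_1, \dots, b_n \in |\mathfrak{M}|$ be arbitrary
(interpreting $u_1, \dots, v_n$) and assume (contraposition!) that the formula
after $\to$ does not hold for $a_1, \dots, b_n$, i.e.,

(v) $A \models \exists u_{n+1} \in |\mathfrak{M}|\ \exists k_{n+1} \in K_{n+1}\ \forall v_{n+1} \in |\mathfrak{M}|\ \forall l_{n+1} \in L_{n+1}$
$\exists \alpha\,[\mathrm{Ord}(\alpha) \wedge \text{"}\mathfrak{M} \models \neg \Phi^{\mathbf{k}\, k_{n+1} l_{n+1}}_\alpha(a_1, \dots, b_n, u_{n+1}, v_{n+1})\text{"}]$.

By the remarks made above on $\Phi^{\mathbf{k}}_\alpha$ as a function of $\alpha$ and $\mathbf{k}$, and by the
$\Delta$-character of the truth definition of $L_{\infty\omega}$ in KPU, it follows that "$\cdots$" is a
$\Delta$-predicate of all the variables involved, including $\alpha$. The only unrestricted
quantifier in (v) is $\exists \alpha$. Hence, by $\Sigma$-collection there is a set $x \in A$
such that in (v) $\exists \alpha$ can be replaced by $\exists \alpha \in x$; $x$ can clearly be taken to be
an ordinal $\alpha_0 < \mathrm{Ord}(A)$. By (i), then, $\exists \alpha$ can actually be eliminated and the
subscript $\alpha$ can be replaced by $\alpha_0$. By the definitions involved, the
conclusion thus obtained is identical to saying that
$\mathfrak{M} \models \neg \Phi^{\mathbf{k}}_{\alpha_0+1}[a_1, \dots, b_n]$, and this establishes the claim.

Now, assume that $\mathfrak{M} \models \Phi_{\mathrm{Ord}(A)}$, i.e., $\mathfrak{M} \models \bigwedge_{\alpha < \mathrm{Ord}(A)} \Phi_\alpha$. We will use 5.1.
Define the predicate $P^{\mathbf{k}}(u_1, \dots, v_n)$ to be $\bigwedge_{\alpha < \mathrm{Ord}(A)} \Phi^{\mathbf{k}}_\alpha$. The first conjunct
of the sentence in 5.1 holds by the assumption, the second by the claim and
the last by the definition of $\Phi^{\mathbf{k}}_0$. □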


Notice that 5.4(ii) and (iii) say that $\models' \Phi \leftrightarrow \bigwedge_{\alpha < \aleph_1} \Phi_\alpha$. Taking into account
the game-form Theorem 5.3, we obtain:


5.5. COROLLARY. Any $\Sigma^1_1$-over-$L_{\omega_1\omega}$-class of countable structures is the
intersection of $\aleph_1$ elementary-over-$L_{\omega_1\omega}$-classes.

By remarks made at the beginning of this section, as a special case we
have:


5.6. COROLLARY. Every analytic set in the Cantor space is the intersection of
$\aleph_1$ Borel sets.
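The equivalence behind 5.5 can be spelled out in two lines. This is only a restatement of 5.4(ii) and (iii) with $\kappa = \aleph_0$, using that $\aleph_1$ is a limit ordinal, so that $\Phi_{\aleph_1} = \bigwedge_{\alpha < \aleph_1} \Phi_\alpha$; the notation $\mathrm{Mod}(\cdot)$ for the class of models is introduced here for convenience.

```latex
\begin{align*}
\text{5.4(ii):}\quad & \mathfrak{M} \models \Phi \;\Longrightarrow\;
   \mathfrak{M} \models \Phi_\alpha \text{ for all } \alpha < \aleph_1, \\
\text{5.4(iii):}\quad & \mathfrak{M} \text{ countable and }
   \mathfrak{M} \models \bigwedge_{\alpha < \aleph_1} \Phi_\alpha
   \;\Longrightarrow\; \mathfrak{M} \models \Phi .
\end{align*}
```

Hence the countable models of $\Phi$ are exactly the countable structures lying in all of the $\aleph_1$ classes $\mathrm{Mod}(\Phi_\alpha)$, $\alpha < \aleph_1$.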


5.7. REMARK. One reason why admissible sets with
$\mathrm{Ord}(A) = \omega$ (i.e., $\omega \notin A$) are missing here is that the assumption $\omega \in A$ is used throughout
this section. This could be avoided, however, at the cost of a little extra
work. The main modification that would be required is to use $A$-rec.
Vaught formulas instead of $A$-finite ones. $\Phi$ (as above) is called $A$-recursive
iff for each $n \in \omega$ the function $F_n = (N^{k_1 \dots l_n} : k_1 \in K_1, \dots, l_n \in L_n)$
is $A$-finite and $(F_n : n < \omega)$ is $A$-recursive. Then in 5.3, for any admissible
set $A$, $\Phi$ can be chosen to be $A$-rec. for $A$-finite $\exists R\,\varphi(R)$, and in fact, the
"index" of the $A$-rec. $\Phi$ (in a suitable sense) is an $A$-rec. function of
$\exists R\,\varphi(R)$.

Next one notes that for $A$-rec. $\Phi$ the family $(\Phi^{\mathbf{k}}_\alpha : \alpha < \mathrm{Ord}(A),\ n < \omega,\ \mathbf{k} \in K_1 \times \cdots \times L_n)$
is an $A$-rec. function (in particular, each approximation
$\Phi^{\mathbf{k}}_\alpha$ is $A$-finite, for $\alpha \in A$).

Also, 5.4(iv) remains true for $A$-recursive $\Phi$, with essentially the same
proof.


5.8. HISTORICAL REMARK. The material in this section has a complicated
history. The game-form theorem, formulated for finitary $\Sigma^1_1$-sentences, is
due to SVENONIUS [1965]. Vaught discovered the present form, but then he
realized that the present form is a consequence of Svenonius'
theorem (cf. VAUGHT [1973]). Meanwhile, and earlier than Vaught,
MOSCHOVAKIS [1970] rediscovered a version of Svenonius' theorem, although
strictly speaking his result is weaker. However, Moschovakis discovered
(cf. the same reference) a version of the approximations of game sentences
earlier than Vaught; he used his version to give a proof of "$\Pi^1_1$ = inductive"
on a countable structure. The main difference between Moschovakis' and
Vaught's works is that Moschovakis worked with a fixed structure.
Accordingly, he essentially shows the game-form theorem and the theorem
on approximations, 5.4(iii), only in a "local" form. On the other hand, the
proof of Vaught's version is very close to Moschovakis' proof. The same
thing is happening in both; Vaught saw more of what this thing was. We
can also say that Vaught's work is a "uniform" version of that of
Moschovakis.

6. Applications of game formulas; completeness


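6.1. THEOREM. Let $A$ be an admissible set, $A \subseteq HC_U$ (hence, every set in $A$
is countable). Then the truth-definition of $A$-finite $\Pi^1_1$-sentences in $A$-finite
models is a $\Sigma$-predicate on $A$. I.e., there is a $\Sigma$-formula $\sigma(x, y, z)$ such that
for an $A$-finite $\Pi^1_1$-sentence $\forall R\,\varphi(R)$ and an $A$-finite structure $\mathfrak{M}$,


\[ \mathfrak{M} \models \forall R\,\varphi(R)[a] \quad \text{iff} \quad A \models \sigma[\forall R\,\varphi(R), \mathfrak{M}, a]. \]


Moreover, the same $\sigma$ works for all $A$ containing $\omega$.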


PROOF. Of course, the claim is equivalent to saying that "$\Sigma^1_1$-truth is $\Pi$ on
$A$". This is an immediate consequence of our work in the last section. For
$\exists R\,\varphi(R)$ in $A$, let $\Phi$ be a Vaught-equivalent of $\exists R\,\varphi(R)$; $\Phi$ is a
$\Sigma$-function of $\exists R\,\varphi(R)$ (cf. 5.3). For $\mathfrak{M}$ in $A$, we have $\mathfrak{M} \models \exists R\,\varphi(R)$ iff
$\mathfrak{M} \models \Phi$ (since $\mathfrak{M}$ is countable) iff for all ordinals $\alpha \in A$, $\mathfrak{M} \models \Phi_\alpha$, by 5.4(iv).
This proof is valid for the case $\omega \in A$; for the general case, use 5.7; now we
should take seriously our vague remark on "indices" in 5.7. □
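Written as a chain, the argument reads as follows (a sketch for the case $\omega \in A$; the quantifier $\forall \alpha$ ranges over the elements of $A$):

```latex
\begin{align*}
\mathfrak{M} \models \exists R\,\varphi(R)
  &\iff \mathfrak{M} \models \Phi
     && \text{(5.3, $\mathfrak{M}$ countable)} \\
  &\iff \forall \alpha \in A\;\bigl(\mathrm{Ord}(\alpha) \to
        \mathfrak{M} \models \Phi_\alpha\bigr)
     && \text{(5.4(ii) and (iv))}.
\end{align*}
```

The matrix "$\mathfrak{M} \models \Phi_\alpha$" is $\Delta$ on $A$, since $\alpha \mapsto \Phi_\alpha$ is a $\Sigma$-operation and the truth definition is $\Delta$ in KPU; one unbounded universal quantifier in front of it gives a $\Pi$ predicate, and the statement for $\Pi^1_1$-sentences follows by negation.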


6.2. BARWISE'S ABSTRACT COMPLETENESS THEOREM. For $A$ as in 6.1, the
predicate "$\varphi$ is logically valid", for sentences $\varphi$ in $L_A$, is $\Sigma$ on $A$, uniformly
for all $A$.


PROOF (VAUGHT [1973]). Our proof will work only if $\omega \in A$ (but see the
remarks below). By the downward Löwenheim-Skolem theorem, any
sentence in $L_{\omega_1\omega}$ is valid iff it is true in all structures with domain $\nu$, for any
$\nu \le \omega$. $\nu$ itself is a structure with the empty similarity type. Let $\varphi \in L_A$, and let
$R$ denote the set of non-logical symbols in $\varphi$. Then $\varphi$ is logically valid iff
$\forall \nu \le \omega\ \ \nu \models \forall R\,\varphi(R)$. Now apply 6.1. □
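The reduction in this proof can be displayed as follows (a sketch for the case $\omega \in A$, reading the bound on $\nu$ as $\nu \le \omega$; $\sigma$ is the $\Sigma$-formula of 6.1, and the empty parameter list is written $\emptyset$):

```latex
\begin{align*}
\models \varphi
  &\iff \forall \nu \le \omega\;\; \nu \models \forall R\,\varphi(R) \\
  &\iff A \models \forall \nu \in \omega + 1\;\;
        \sigma[\forall R\,\varphi(R),\, \nu,\, \emptyset].
\end{align*}
```

The last line is a bounded quantifier in front of a $\Sigma$-formula, hence again $\Sigma$ on $A$.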


Remark. If there is at least one infinite set in $A$, this proof can be saved. We
have to use models with possibly non-standard interpretations of equality,
but that is no loss. The remaining admissible sets are exactly the $HF_U$,
for various $U$, and the logic is exactly finitary logic. Even in this case, the
proof can be "saved" by "adjoining" an infinite set to $A$; details are
omitted.


Remark. Let us emphasize that 6.2 is not the full Completeness Theorem
of Barwise (stated in the Introduction!). On the other hand, it is interesting
that the abstract completeness theorem for $L_A$ plays a crucial role in the
model theory of $L_A$, much more so than in the model theory of ordinary
first order logic.


Let $\mathrm{Valid}_A(\varphi)$ denote the predicate "$\varphi$ is a logically valid $A$-finite
sentence of some fragment $L_A$". There are countable admissible sets $A$
such that $\mathrm{Valid}_A$ is $\Delta$ on $A$; e.g., if $A \prec_1 HC$ (exercise). But next admissible
sets behave like HF in this sense: $\mathrm{Valid}_A$ is not $\Delta$, and in fact, it is a
"complete" $A$-r.e. set. Let $A = a^+$ be a pure next admissible set with
$\omega \in A$. Let $\varphi(x_1, x_2)$ be a $\Sigma$ formula, let $B \subseteq A$ be the set defined by
$\varphi(x_1, c)$ on $A$, $c$ being an element of $A$. We will now refer to 3.10. The
infinitary sentence (A KPU A @(a4-))—> 9(b, c) is an A-rec. function of b,c 
and the formula $\varphi$. Denote this function by $\mathcal{F}(b, c, \varphi)$. 3.10 says that
$b \in B \Leftrightarrow \mathcal{F}(b, c, \varphi) \in \mathrm{Valid}_A$, expressing the "completeness" of the $\Sigma$
predicate $\mathrm{Valid}_A$ with respect to all $\Sigma$ predicates on $A$. Now, by the Cantor
diagonal argument, the predicate $Q(d) \Leftrightarrow$ "$d = (c, \varphi)$ for some $c \in A$ and
$\Sigma$-formula $\varphi$, and $\mathcal{F}((c, \varphi), c, \varphi) \notin \mathrm{Valid}_A$" is not $\Sigma$. But of course, $Q$ is $\Pi$;
hence the $\Sigma$ predicate $\neg Q(d)$ is not $\Pi$, hence it is not $\Delta$. Thus, there is a
$\Sigma$ predicate on $A$ that is not $\Delta$; hence by the "completeness" of $\mathrm{Valid}_A$, we
have proved


6.3. THEOREM. For a pure countable next admissible set $A = a^+$ with $\omega \in A$,
$\mathrm{Valid}_A$ is not $\Delta$ on $A$.
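The diagonal step behind 6.3 can be made explicit. The following is a sketch under the assumption that the defining clause of $Q$ is read with $\notin$; $\mathcal{F}(b, c, \varphi)$ stands for the $A$-recursive function introduced before 6.3, and the names $\varphi_0$, $c_0$, $d_0$ are introduced here for illustration only. Suppose $Q$ were $\Sigma$ on $A$, defined by $\varphi_0(x_1, c_0)$, and put $d_0 = (c_0, \varphi_0)$:

```latex
\begin{align*}
Q(d_0) &\iff \mathcal{F}(d_0, c_0, \varphi_0) \in \mathrm{Valid}_A
   && \text{(3.10, applied to } B = Q\text{)} \\
Q(d_0) &\iff \mathcal{F}((c_0, \varphi_0), c_0, \varphi_0) \notin \mathrm{Valid}_A
   && \text{(definition of } Q\text{)}.
\end{align*}
```

Since $d_0 = (c_0, \varphi_0)$, the two lines contradict each other, so $Q$ is not $\Sigma$.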


Next we state the converse of 3.12 for countable $\mathfrak{M}$.


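6.4. THEOREM (BARWISE [1975]). Let $\mathfrak{M} = (|\mathfrak{M}|, R_1, \dots, R_l)$ be a countably
infinite structure of a finite similarity type and let $S$ be a relation on $|\mathfrak{M}|$. Let
$A = \mathrm{HYP}_{\mathfrak{M}}$.
(i) If $S$ is $\Pi^1_1$ over $L_A$ on $\mathfrak{M}$, then $S$ is $\Sigma$ on $A$.
(ii) $S$ is $\Pi^1_1$ on $\mathfrak{M}$ iff $S$ is $\Sigma$ on $\mathrm{HYP}_{\mathfrak{M}}$.
(iii) $S$ is $\Delta^1_1$ on $\mathfrak{M}$ iff $S \in \mathrm{HYP}_{\mathfrak{M}}$.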


PROOF. (ad (i)) This follows immediately from 6.1 since $\mathfrak{M} \in A$.
(ad (ii)) This follows from (i) and 3.12.
(ad (iii)) This follows from (ii) and the $\Delta$-separation principle. □


If $\mathfrak{M} = (\omega, 0, 1, +, \cdot)$, then $\mathrm{HYP}_{\mathfrak{M}}$ is, for all practical purposes, the same
as $\mathrm{HYP}(\omega)$; e.g. the pure sets in $\mathrm{HYP}_{\mathfrak{M}}$ are exactly those in $\mathrm{HYP}(\omega)$
(exercise). So we obtain the following result.

6.5. COROLLARY (KRIPKE [1964], PLATEK [1966]). The $\Pi^1_1$ subsets of $\omega$ are
exactly those subsets of $\omega$ that are $\mathrm{HYP}(\omega)$-r.e. The $\Delta^1_1$ subsets of $\omega$ are
exactly the $\mathrm{HYP}(\omega)$-finite subsets of $\omega$.


To emphasize the historically very important 6.5, we give a concrete
description of $\mathrm{HYP}(\omega)$. Let us call an ordinal $\alpha$ recursive if there is a
recursive ordering of a subset of $\omega$ having order-type $\alpha$. Note that if $\alpha$ is a
recursive ordinal and $\beta < \alpha$, then $\beta$ is recursive too. Let $\omega_1^{CK}$ denote the
first non-recursive ordinal.


6.6. THEOREM (KRIPKE [1964], PLATEK [1966]). $\omega^+ = L_{\omega_1^{CK}}$, i.e. the first
admissible ordinal $> \omega$ is $\omega_1^{CK}$.


PROOF. Let $\mathrm{HYP}(\omega) = L_\alpha(\omega) = L_\alpha$. By the $\Sigma$-recursion theorem, it is easy
to show that the order type of a well-ordering in an admissible set belongs
to the admissible set itself. Since all recursive sets belong to $\mathrm{HYP}(\omega)$
(exercise), it follows that $\omega_1^{CK} \le \alpha$.

Now assume that $\omega_1^{CK} < \alpha$, i.e., $\beta = \omega_1^{CK} \in \mathrm{HYP}(\omega)$. We use the notation
$W_e$ for the $e$th r.e. set and the following classical facts:

(i) for every $\Pi^1_1$ set $A \subseteq \omega$, there is a recursive function $f$ such that
$n \in A \Leftrightarrow f(n)$ is the index of a recursive well-ordering $W_{f(n)}$ of a subset of $\omega$
(Suslin-Kleene normal form, cf. Chapter C.8), and

(ii) there is a $\Pi^1_1$ set $\subseteq \omega$ that is not $\Sigma^1_1$ (cf. Chapter C.8).

Let $B$ be a $\Pi^1_1$ subset of $\omega$ that is not $\Sigma^1_1$, and apply (i). Since for each
$n \in B$, the order type of $W_{f(n)}$ is $< \beta$, there is an order preserving map $g$ of
the ordering $W_{f(n)}$ into $(\beta, \in \restriction \beta)$. Of course, the existence of such a $g$
implies that $n \in B$. The existence of $g$ can be expressed by saying that an
$A$-finite structure, $A$-recursively depending on $n$, will satisfy a certain
$\Sigma^1_1$-sentence over $L_A$ (exercise). Putting those facts together with (the dual
of) 6.1, we obtain that $B$ is $\Pi$ on $A$, hence by (the dual of) 6.5 that $B$ is a $\Sigma^1_1$
subset of $\omega$, which is a contradiction.


6.6'. COROLLARY. The $\Delta^1_1$ subsets of $\omega$ are exactly those that are constructible
before $\omega_1^{CK}$.


Remark. Corollary 6.6' is a version of Kleene's theorem (KLEENE [1955]),
one of the most important results of effective descriptive set theory and
(applied) second order logic. Kleene's theorem says that a subset of $\omega$ is
hyperarithmetic iff it is $\Delta^1_1$. If we change Kleene's definition of hyp. to "to
be an element of $L_{\omega_1^{CK}}$", 6.6' amounts to the modified Kleene theorem.
Actually, the two definitions of hyp. are similar; both involve a hierarchy
indexed by recursive ordinals. As we see, 6.5 generalizes to arbitrary
countable structures in 6.4. Kleene's theorem itself is a more refined and
deeper statement that does not similarly generalize. Cf. a discussion of this
point in MOSCHOVAKIS [1974].


We now state without proof a result that shows what role the first stable
ordinal $\sigma_0$ plays in effective descriptive set theory.


6.7. THEOREM. (i) $\sigma_0$ is the first non-$\Delta^1_2$-ordinal, i.e., the first ordinal $\alpha$ such
that there is no $\Delta^1_2$ well-ordering of a subset of $\omega$ with order type $\alpha$.

(ii) For $A = L_{\sigma_0}$, a subset $B$ of $\omega$ is $\Sigma^1_2$ iff it is $\Sigma$ on $A$.

(iii) $B \subseteq \omega$ is $\Delta^1_2$ iff $B \in A$.


We next state an important result whose proof uses the Barwise
Completeness and Compactness Theorems, and also the omitting types
theorem (cf. Chapter A.2). A set $a$ is called internal for a model $\mathfrak{N}$
satisfying the extensionality axiom if $a \in WF(\mathfrak{N})$. Let $A = \mathrm{HYP}_{\mathfrak{M}}$, for a
countable structure $\mathfrak{M}$. Let $T$ be a theory in $L_A$, with $L = \{\in, U, \dots\}$, and let $T$
be $\Sigma$ on $A$.


6.8. THEOREM. If $a$ is internal for every model of $T$, then $a \in A$.


Theorem 6.8 is a modern version of the Gandy-Kreisel-Tait Theorem:
For any consistent $\Pi^1_1$ set $T$ of axioms for second order number theory, if
$a \subseteq \omega$ is "in" every model of $T$, then $a$ is hyperarithmetic.

For a proof and history of 6.8, cf. BARWISE [1975], Chapter 4,
Theorem 1.3.

Finally, let us summarize the completeness and $\Sigma$-compactness
Theorems for a countable admissible set $A$. The next theorem will be the
basic tool in the next section on $\Sigma$-saturated structures. We leave its proof
as an exercise.


6.9. THEOREM. Let $T$ be a $\Sigma$ theory in a countable admissible fragment $L_A$.
Then for any $\varphi \in L_A$,
(i) $T \models \varphi$ iff for some $A$-finite $T' \subseteq T$, $T' \models \varphi$,
(ii) $\{\varphi \in L_A : T \models \varphi\}$ is $\Sigma$ on $A$ (extended completeness).
(iii) Let $p \subseteq I \times L_A$ be a $\Sigma_A$ family such that for each $i \in I$, $p_i =
\{\psi : (i, \psi) \in p\}$ is a set of $L_A$ sentences. Then the family $\{(i, \varphi) : i \in I,\ p_i \models \varphi\}$
is $\Sigma$ on $A$ (uniform extended completeness).

7. $\Sigma_A$-saturated structures


Throughout this section, $A$ is a countable admissible set, $L$ is a language
that is a $\Delta$ subset of $A$. A $\Sigma_A$-family of $L_A$-types with parameters $x$ is, by
definition, a $\Sigma$ set $p$ such that $p \subseteq I \times L_A$ for some $A$-finite index-set $I$, and
if $(i, \psi) \in p$, then $\psi = \psi(v, x)$ is an $L_A$-formula with the free variables
indicated. For $i \in I$, let $p_i$, or $p_i(v, x)$, be the "type" $\{\psi(v, x) : (i, \psi) \in p\}$; $p_i$
is a $\Sigma$ set of formulas.

Let $\mathfrak{M}$ be an $L$-structure and consider a $\Sigma_A$-family of $L_A$-types with
parameters in $\mathfrak{M}$, i.e., with some elements $a$ in $|\mathfrak{M}|$ (or constants denoting
them) substituted for the parameters $x$. We talk about a
family of types over $\mathfrak{M}$. We say that $p$ is realized in $\mathfrak{M}$ if, using the above
notation $p_i$, $\mathfrak{M} \models \exists v \bigvee_{i \in I} \bigwedge p_i(v, a)$, i.e., at least one type $p_i$ in the family is
realized.


7.1. DEFINITION (RESSAYRE [1976], in present form: HARNIK [1974]). $\mathfrak{M}$ is
$\Sigma_A$-saturated iff for every $\Sigma_A$-family $p$ of types over $\mathfrak{M}$, if every $A$-finite
subfamily $q \subseteq p$ is realized in $\mathfrak{M}$, then $p$ itself is realized in $\mathfrak{M}$; in notation


\[ \mathfrak{M} \models \Bigl[ \bigwedge_{q \subseteq p,\ q \in A} \exists v \bigvee_{i \in I} \bigwedge q_i(v, a) \Bigr] \to \exists v \bigvee_{i \in I} \bigwedge p_i(v, a). \]


Remarks. In case $I$ is a singleton, the infinite disjunction disappears and
we obtain a condition formally close to the familiar $\aleph_0$-saturation of model
theory. In fact, for $A = HF$, we obtain what is called recursive saturation
(actually, r.e.-saturation, but the two are equivalent) in Chapter A.2. Since
a finite set $I$ is not essentially more general than a singleton (exercise),
$\Sigma_{HF}$-saturated = recursively saturated.
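For orientation, here is the shape of the saturation condition when $I$ is a singleton, so that the family is a single $\Sigma_A$ type $p(v, a)$ (a sketch; it is just Definition 7.1 with the disjunction over $I$ removed):

```latex
\[
\mathfrak{M} \models
  \Bigl[\, \bigwedge_{\substack{q \subseteq p \\ q \in A}}
     \exists v \bigwedge q(v, a) \Bigr]
  \;\to\; \exists v \bigwedge p(v, a).
\]
```

For $A = HF$ the $A$-finite subsets $q$ are the finite subsets of $p$ and the $\Sigma_A$ sets are the r.e. sets, which gives the r.e.-saturation mentioned above.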

An equivalent definition of a $\Sigma_A$-saturated model is obtained by "breaking
up" the original definition into two parts. The first part is the original
condition with $I$ a singleton (this condition is called $\Sigma$-saturation by SCHLIPF
[1976], actually more appropriately than the full condition) and the second
part is obtained by deleting $v$ and $\exists v$ from the original (this second
condition is called $A$-regularity by SCHLIPF [1976]).

The reader will see that in the proof of 7.2 below, it is $A$-regularity that
is needed "to handle disjunctions".

Every finite $L$-structure is $\Sigma_A$-saturated. For any set $a$, the structure $(a)$
of the empty similarity type is $\Sigma_A$-saturated (exercise: note the scarcity of
definable subsets of $(a)$ in finitely many parameters). There are many
$\Sigma_A$-saturated structures, as Corollary 7.3 shows. The following is the
fundamental theorem on $\Sigma_A$-saturated structures.

7.2. THEOREM (Strong relation universality of $\Sigma_A$-saturated structures,
RESSAYRE [1976]). Let $T'$ be a $\Sigma_A$-theory in a language $L'_A$, such that
$L' \supseteq L$. If $\mathfrak{M}$ is a countable $\Sigma_A$-saturated $L$-structure satisfying all
$L_A$-consequences of $T'$, then $\mathfrak{M}$ has a $\Sigma_A$-saturated expansion $\mathfrak{M}'$ (i.e.,
$\mathfrak{M}' \restriction L = \mathfrak{M}$) that is a model of $T'$.


PROOF (HARNIK [1974]). Enlarge the language $L'$ to $L'(|\mathfrak{M}|)$ by introducing
names for the elements of $|\mathfrak{M}|$. The name of $a \in |\mathfrak{M}|$ will also be denoted
by $a$. Let $p^0, p^1, \dots, p^n, \dots$ be a list of all $\Sigma_A$-families $p(v, a)$ of $L'_A$ types
over $\mathfrak{M}$ (note that $A$ and $\mathfrak{M}$ are countable). We will define by induction the
sets $\Theta_0 \subseteq \Theta_1 \subseteq \cdots$ such that, among others, we have (i) and (ii) below.

(i) $\Theta_n$ is a $\Sigma_A$ set of $L'_A(|\mathfrak{M}|)$-sentences involving altogether only finitely
many constants from $|\mathfrak{M}|$ (notice that because of this condition, there is no
difficulty with the meaning of $\Theta_n$ being $\Sigma_A$ even though $|\mathfrak{M}|$ is not a subset
of $A$: one replaces the finitely many constants with finitely many free
variables).

(ii) If $\Theta_n \models \psi(a)$, $\psi(x)$ is an $L_A$-formula, and $a \in |\mathfrak{M}|$, then $\mathfrak{M} \models \psi[a]$.

Our aim is to construct the $\Theta_n$ so that $\bigcup_{n<\omega} \Theta_n$ will define (by 4.3) a model
of $T'$ that will be isomorphic to a $\Sigma_A$-saturated expansion of $\mathfrak{M}$.

Put $\Theta_0 = T'$. (i) and (ii) are satisfied by the assumptions of the theorem.

Assume $\Theta_n$ has been defined. Let $p^n = p(v, a) \subseteq I \times L'_A$ be the $n$th family
of types to be considered, $a$ a list of all $|\mathfrak{M}|$ constants involved in either $\Theta_n$
or $p^n$. In constructing $\Theta_{n+1}$, we make sure that the saturation condition will
hold for $p^n$.

For $i \in I$, define $r_i(v, a) = \{\psi(v, a) : \psi(v, x)$ is an $L_A$-formula (!) and
$\Theta_n \cup p_i(v, a) \models \psi(v, a)\}$. By the extended completeness theorem (6.9(ii)),
each $r_i$ is a $\Sigma_A$ set, and actually, by uniform extended completeness
(6.9(iii)),

\[ r = \{(i, \psi) : \psi \in r_i,\ i \in I\} \]

is a $\Sigma_A$ family of types. Consider the following cases:

Case 1. $\mathfrak{M}$ realizes the family $r$. Let $b \in |\mathfrak{M}|$ and $i \in I$ be such that
$\mathfrak{M} \models \bigwedge r_i(b, a)$. Take $\Theta_{n+1} = \Theta_n \cup p_i(b, a)$. Notice that, by the definition of
$r$, conditions (i) and (ii) continue to hold for $\Theta_{n+1}$.

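Case 2. $\mathfrak{M}$ does not realize $r$. Since $\mathfrak{M}$ is $\Sigma_A$-saturated, there is an
$A$-finite family $s \subseteq r$ such that $s$ is not realized in $\mathfrak{M}$. Since $s_i \subseteq r_i$ for each
$i \in I$ and $s_i$ is $A$-finite, we obtain by $\Sigma$-compactness that

(iii) $A \models \forall i \in I\ \exists t\,[t \subseteq p_i \wedge \Theta_n \models \forall v\,(\bigwedge t(v, a) \to \bigwedge s_i(v, a))]$.

By extended completeness, the predicate in brackets (of the variables $t$ and
$i$) is $\Sigma$ on $A$. By $\Sigma$-collection, there is a set $w \in A$ such that for every $i \in I$,
the $t$ can be chosen in $w$. But then for $q = (\bigcup w) \cap p$,
we have

\[ A \models \forall i \in I\,\bigl(q_i \subseteq p_i \wedge \Theta_n \models \forall v\,(\bigwedge q_i(v, a) \to \bigwedge s_i(v, a))\bigr). \]

In other words,

(iv) $\Theta_n \models \neg \exists v \bigvee_{i \in I} \bigwedge s_i(v, a) \to \neg \exists v \bigvee_{i \in I} \bigwedge q_i(v, a)$.

In case 2, let us put $\Theta_{n+1} = \Theta_n \cup \{\neg \exists v \bigvee_{i \in I} \bigwedge q_i(v, a)\}$. Since $s$ is not
realized in $\mathfrak{M}$, the formula before $\to$ in (iv) is true in $\mathfrak{M}$. Hence, by (iv), any
$L_A$-consequence of $\Theta_{n+1}$ is a consequence of $\Theta_n$ plus an $L_A$-sentence true
in $\mathfrak{M}$. But then the condition (ii) will be inherited from $\Theta_n$ to $\Theta_{n+1}$. So (i)
and (ii) hold again.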

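Having defined $\Theta_n$ for each $n < \omega$, put $\Theta_\omega = \bigcup_{n<\omega} \Theta_n$. We claim that $\Theta_\omega$
satisfies (i)-(iii) in 4.3. The induction hypothesis (ii) implies that $\Theta_\omega$ is
finitely consistent. If $\bigvee \Phi$ is a valid disjunction in $L'_A(|\mathfrak{M}|)$, then the $\Sigma_A$ (in
fact, $A$-finite) family $p \subseteq \Phi \times L'_A$ of types $p_\varphi = \{\varphi,\ v = v\}$ occurs as some $p^n$.
When defining $\Theta_{n+1}$, case 1 must occur since $\bigvee \Phi$ is valid. So $\Theta_{n+1} =
\Theta_n \cup \{\varphi\}$ for some $\varphi \in \Phi$, establishing 4.3(ii) for $\Theta_\omega$. Checking 4.3(iii) is
similar.

By 4.3, $\Theta_\omega$ has a model $\mathfrak{M}'$ each of whose elements is denoted by a
constant in $|\mathfrak{M}|$ and such that $\mathfrak{M}' \models \sigma$ iff $\sigma \in \Theta_\omega$, for $\sigma \in L'_A(|\mathfrak{M}|)$. Since for
an $L_A$-formula $\psi(x)$, $\mathfrak{M}' \models \psi(a)$ implies $\psi(a) \in \Theta_n$ for some $n < \omega$, hence
by (ii), $\mathfrak{M} \models \psi[a]$, it follows that $a \mapsto a^{\mathfrak{M}'}$ is an isomorphism of $\mathfrak{M}$ onto
$\mathfrak{M}' \restriction L$. Without loss of generality, we can assume that $\mathfrak{M}' \restriction L = \mathfrak{M}$ and
$a^{\mathfrak{M}'} = a$. Since $\Theta_0 = T'$, $\mathfrak{M}' \models T'$. Finally, let $p = p^n(v, a)$ be any $\Sigma_A$ family
of types over $\mathfrak{M}'$. Notice that according to the construction of $\Theta_{n+1}$, in case
1, $p$ is realized in $\mathfrak{M}'$ by the element $b$ and in case 2, $p$ is not even
$A$-finitely realized in $\mathfrak{M}'$. This means that $\mathfrak{M}'$ is $\Sigma_A$-saturated. □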
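The inductive step of the construction can be summarized in one display (a sketch in the notation of the proof; $r$, $b$, $i$ and $q$ are as chosen in Cases 1 and 2):

```latex
\[
\Theta_{n+1} =
\begin{cases}
\Theta_n \cup p_i(b, a)
  & \text{if } \mathfrak{M} \text{ realizes } r
    \text{ (Case 1)}, \\[4pt]
\Theta_n \cup \bigl\{ \neg \exists v \bigvee_{i \in I}
    \bigwedge q_i(v, a) \bigr\}
  & \text{otherwise (Case 2)}.
\end{cases}
\]
```

In the first case $p^n$ is realized by $b$ in the final model; in the second the $A$-finite subfamily $q$ is already not realized there, so the saturation condition for $p^n$ holds in either case.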


7.3. COROLLARY (Existence of $\Sigma_A$-saturated models). Any consistent
$\Sigma_A$-theory $T$ in $L_A$ has a $\Sigma_A$-saturated model.


PROOF. Take a finite or countable model $\mathfrak{M}_0$ of $T$. Considering the
structure $(|\mathfrak{M}_0|)$ of the empty similarity type $L_0$, every sentence of $L_{0A}$ that is
a consequence of $T$ is true in $(|\mathfrak{M}_0|)$. Since $(|\mathfrak{M}_0|)$ is $\Sigma_A$-saturated, by 7.2 it
has an expansion that is a $\Sigma_A$-saturated model of $T$. □


An $L$-structure $\mathfrak{M}$ is called $\Sigma_A$-relation universal if for every $\Sigma_A$ theory
$T'$ in $L'_A$, for some $L' \supseteq L$, if $\mathfrak{M}$ satisfies every $L_A$-consequence of $T'$, it has
an expansion that is a model of $T'$. A weakening of 7.2 is:


7.4. COROLLARY ($\Sigma_A$-relation universality of $\Sigma_A$-saturated models). Every
$\Sigma_A$-saturated countable structure is $\Sigma_A$-relation universal.


The Corollaries 7.3 and 7.4 by no means exhaust the full power of 7.2, as
Ressayre's work amply demonstrates, although for many admissible sets $A$,
$\Sigma_A$-relation universality does in fact characterize $\Sigma_A$-saturated models.


7.5. COROLLARY. Any $A$-finite structure is $\Sigma_A$-saturated.


PROOF. This is easy to prove directly, but it also follows from 7.3: one uses
the sentence $\bigwedge \mathrm{Diag}(\mathfrak{M}) \wedge \forall x \bigvee_{a \in |\mathfrak{M}|} x = a$ that characterizes $\mathfrak{M}$ up to
isomorphism. □


7.6. THEOREM (RESSAYRE [1976]). Let $\mathfrak{M}$ be a countably infinite
$\Sigma_A$-saturated structure of an $A$-finite language $L$. Let $m$ be an infinite set,
$m \in A$. Then there is an admissible set $B \supseteq A$ such that $B$ contains an
isomorphic copy of $\mathfrak{M}$ with underlying set $m$ and such that $\mathrm{Ord}(B) =
\mathrm{Ord}(A)$.


PROOF (Outline; see HARNIK [1974]). Let $T$ be a $\Sigma_A$-theory, in a language
including $L$, $\in$, $U$, constants $\ulcorner a \urcorner$ for $a \in A$, and some other symbols,
whose models, when reduced to $L \cup \{\in, U\}$, are exactly those $\mathfrak{N}$ whose
$\in, U$-reduct $\mathfrak{N}_0$ is a model of KPU such that $\mathfrak{N}_0$ is isomorphic to an
end-extension of $A$ and $\mathfrak{N}_0$ contains, as an "element", an isomorphic copy
of the $L$-reduct of $\mathfrak{N}$ with underlying set $m$. Of course, any infinite
$L$-structure can be expanded to a model of $T$. Since $\mathfrak{M}$ can be expanded to
a model of $T$, by 7.2 it can be expanded to a $\Sigma_A$-saturated model $\mathfrak{N}$ of $T$
(why?). We can assume that (the $\in, U$-reduct of) $\mathfrak{N}$ is an end extension of
$A$. The w.f. part of $\mathfrak{N}$ is an admissible set $B \supseteq A$ containing, as an element,
an isomorphic copy of $\mathfrak{M}$ with underlying set $m$. It remains to show that
$\mathrm{Ord}(B) \le \mathrm{Ord}(A)$. Let $a$ be any "ordinal" (if any) of $\mathfrak{N}$ such that each
$\alpha < \mathrm{Ord}(A)$ is "smaller" than $a$. Consider the $\Sigma_A$-type

\[ \{\text{"$v$ is an ordinal"}\} \cup \{\alpha < v : \alpha < \mathrm{Ord}(A)\} \cup \{v < a\}. \]

It is clearly $A$-finitely satisfiable in $\mathfrak{N}$, hence by $\Sigma_A$-saturation, it is satisfiable
in $\mathfrak{N}$. The conclusion is that among the "ordinals" $a$ in $\mathfrak{N}$ that are
"greater" than all $\alpha < \mathrm{Ord}(A)$ there is no "smallest" one. Hence $\mathrm{Ord}(A)$
cannot belong to $B$, hence $\mathrm{Ord}(B) = \mathrm{Ord}(A)$. □

In the next theorem, we collect some results characterizing
$\Sigma_A$-saturation for certain admissible sets $A$. We omit the proofs although they
don't use anything going beyond the foregoing material.


7.7. THEOREM. Let $A$ be a resolvable countable admissible set, i.e., let $A$ be
the union of the range of an $A$-recursive function defined on the ordinals of
$A$. Then the conclusion of 7.6 characterizes $\Sigma_A$-saturated structures of an
$A$-finite similarity type. If $A = \mathrm{HYP}(a)$ for a pure set $a$, $\mathfrak{M}$ is an $L$-structure,
$L \in A$, then $\mathfrak{M}$ is $\Sigma_A$-saturated iff for $B = \mathrm{HYP}(\{\mathfrak{M}, a\})$, considered in $V_{|\mathfrak{M}|}$,
we have $\mathrm{Ord}(B) = \mathrm{Ord}(A)$. As a consequence, for such $A$, $\Sigma_A$-saturated
structures are defined by a single $\Sigma^1_1$-over-$L_A$ sentence and they are also
characterized by $\Sigma_A$-relation universality.

In view of 7.7, we could have given alternative definitions of
$\Sigma_A$-saturated structures, at least for certain $A$. In fact, the existence theorem
7.3, with one of these definitions of $\Sigma_A$-saturation, goes back to FRIEDMAN
[1973]. But none of these definitions gives us any hint why the following
simple but fundamental property holds for $\Sigma_A$-saturated structures.


7.8. THEOREM. The union of an $L_A$-elementary chain of $\Sigma_A$-saturated
structures is $\Sigma_A$-saturated.


PROOF. The proof is entirely trivial on the basis of the original
definition. □


Ressayre uses 7.8 along with 7.2 to give a new proof of Keisler's
Theorem (KEISLER [1970]) saying that the set of logically valid sentences in
$L_A(Q)$, the logic $L_A$ augmented with the quantifier $Qx$ "there are
uncountably many $x$ ...", is $\Sigma_A$, for any countable admissible $A$. He also
proves a general theorem that has the following theorem of GREGORY
[1972] as a consequence: for any countable admissible $A$, if $T$ is a
$\Sigma_A$-theory in $L_A$, and $T$ has a pair of models $\mathfrak{M}$, $\mathfrak{N}$ such that $\mathfrak{M} \precneqq_{L_A} \mathfrak{N}$, then
$T$ has an uncountable model.

Finally, let us give a proof of a theorem due to G. Sacks and proved
subsequently in a simpler way by Friedman and Jensen, cf. also KEISLER
[1971]. We call an ordinal $\alpha$ admissible if the set of constructible sets up to
$\alpha$, $L_\alpha$, is admissible.


7.9. THEOREM (Sacks). For every countable admissible ordinal $\alpha$ there is a
subset $a$ of $\omega$ such that $\alpha = \omega_1^a =_{\mathrm{df}} \mathrm{Ord}(\mathrm{HYP}(a))$.

PROOF (RESSAYRE [1976]). Let $A = L_\alpha$ be admissible. Let $T$ be the
$\Sigma_A$-theory in the language containing only the binary predicate symbol $<$
whose axioms are

(i) "$<$ is a linear ordering",

(ii) "there is $x$ such that $\{y : y < x\}$ has order type $\beta$" for all $\beta < \alpha$ (it is
a good exercise to show that there is an $A$-finite sentence expressing (ii),
and using only $<$ and $=$).

$T$ is consistent, hence it has a $\Sigma_A$-saturated model $\mathfrak{M}$. By 7.6, w.l.o.g. we
can assume that $\mathfrak{M} \in B$, $\mathfrak{M}$ has universe $\omega$, $B$ is admissible, and $\mathrm{Ord}(B) =
\alpha$. $\mathfrak{M}$ can be naturally identified with a subset $a \subseteq \omega$. It is obvious that for
every $\beta < \alpha$, $\beta \in a^+$, but of course, $\alpha \notin a^+ \subseteq B$. Hence $\alpha = \mathrm{Ord}(a^+)$. □


Our two main applications of $\Sigma_A$-saturated structures will be given in the
next two sections.


8. The covering theorem 


Let $A$ be a countable admissible set and let $\Phi$ be a conjunctive game
sentence in $A$ over a language $L \in A$. Recall the approximations $\Phi_\beta$ of
$\Phi$; we will be interested in the $\Phi_\beta$ for $\beta < \mathrm{Ord}(A)$. Each such $\Phi_\beta$ is a
sentence of $L_A$ and $(\Phi_\beta : \beta < \mathrm{Ord}(A))$ is an $A$-recursive function. $\Phi \in A$
automatically implies that $\omega \in A$; for admissible sets with ordinal $\omega$ we
should consider $A$-recursive conjunctive game sentences, cf. end of
Section 5. The following theorem remains true, with essentially the same
proof (based on earlier remarks), for the more general notion.


8.1. VAUGHT'S COVERING THEOREM (VAUGHT [1973]). If $\varphi$ is any sentence in
$L'_A$ for some $L' \supseteq L$, then $\Phi \models \varphi$ implies that $\Phi_\beta \models \varphi$ for some $\beta < \mathrm{Ord}(A)$.


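PROOF (HARNIK [1974]). Using the heavy guns of the preceding sections, we
can give a very short proof. Assuming that the conclusion is false, consider
the consistent $\Sigma_A$-theory $\{\Phi_\beta : \beta < \mathrm{Ord}(A)\} \cup \{\neg \varphi\}$. By 7.3, the theory
has a countable $\Sigma_A$-saturated model $\mathfrak{M}$. By 7.6, we can assume that $\mathfrak{M} \in B$
for some admissible $B \supseteq A$ with $\mathrm{Ord}(B) = \mathrm{Ord}(A)$. Now, apply 5.4(iv) for
the admissible set $B$ and the $B$-finite structure $\mathfrak{M}$. Since $\Phi_{\mathrm{Ord}(B)} = \Phi_{\mathrm{Ord}(A)} =
\bigwedge_{\beta < \mathrm{Ord}(A)} \Phi_\beta$, we have $\mathfrak{M} \models \Phi_{\mathrm{Ord}(B)}$, hence by 5.4(iv) $\mathfrak{M} \models \Phi$. We have
reached a contradiction since $\mathfrak{M} \models \neg \varphi$ and $\Phi \models \varphi$. □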


This theorem is a fundamental theorem of logic. We will point out some
aspects of its importance below. We first state an immediate consequence
of 8.1 and the game-form theorem 5.3.


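8.2. STRONG CRAIG-LOPEZ-ESCOBAR INTERPOLATION THEOREM. (i) Let
$\varphi \to \psi$ be a valid implication of sentences in $L_{\omega_1\omega}$. Then there is a sentence $\theta$
in $L_{\omega_1\omega}$, a Craig interpolant of $\varphi$ and $\psi$, such that $\varphi \to \theta$, $\theta \to \psi$ are both valid
and all the non-logical symbols in $\theta$ occur both in $\varphi$ and $\psi$. In fact, if $\varphi \in A$,
$\psi \in B$, $A$, $B$ are both admissible, $A \subseteq B$ and $\mathrm{Ord}(A) = \mathrm{Ord}(B)$, then $\theta$ can
be found in $A$.

(ii) For any $\Sigma^1_1$-sentence $\exists R\,\varphi(R)$ over $L_{\omega_1\omega}$ there is an $\aleph_1$-sequence
$(\Phi_\alpha : \alpha < \aleph_1)$ of $L_{\omega_1\omega}$-sentences such that whenever $\exists S\,\psi(S)$ is any other $\Sigma^1_1$
sentence over $L_{\omega_1\omega}$ such that there is no (countable) model of $\exists R\,\varphi \wedge \exists S\,\psi$
($\exists S\,\psi$ is "disjoint" from $\exists R\,\varphi$), then there is $\alpha < \aleph_1$ such that $\exists R\,\varphi \models \Phi_\alpha$
and $\Phi_\alpha \models \neg \exists S\,\psi$ ($\Phi_\alpha$ "separates" $\exists R\,\varphi$ and $\exists S\,\psi$). In addition, $\Phi_\alpha$ is in the
smallest admissible set containing $\exists R\,\varphi$ and $\alpha$, $\{\exists R\,\varphi, \alpha\}^+$, for $\alpha < \aleph_1$. Also,
for given $\exists S\,\psi$, the $\alpha$ above can be found in $\{\exists R\,\varphi, \exists S\,\psi\}^+$.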


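PROOF. (ad (ii)) Our proof will give the slightly weaker result with
$\{\exists R\,\varphi, \alpha, \omega\}^+$, $\{\exists R\,\varphi, \exists S\,\psi, \omega\}^+$ in the appropriate places. Using
$A$-recursive Vaught sentences, we would get the full result. By the game-form
theorem 5.3 there is a Vaught sentence $\Phi$ in $\{\exists R\,\varphi, \omega\}^+$ such that $\Phi$ and
$\exists R\,\varphi$ are equivalent for countable models. Consider the approximations
$\Phi_\alpha$ for $\alpha < \aleph_1$. Given $\exists S\,\psi(S)$ "disjoint" from $\exists R\,\varphi$, we have $\Phi \models' \neg \psi(S)$,
hence by the covering theorem there is $\Phi_\alpha \models \neg \psi(S)$, i.e., $\Phi_\alpha \models \neg \exists S\,\psi(S)$.
Since $\models \Phi \to \Phi_\alpha$, also $\exists R\,\varphi \models' \Phi_\alpha$, hence $\exists R\,\varphi \models \Phi_\alpha$ (why?). The covering
theorem in fact gives $\alpha$ in any admissible set containing $\Phi$, $\exists S\,\psi(S)$ and
thus, in any admissible set containing $\exists R\,\varphi$, $\omega$ and $\exists S\,\psi(S)$.

(ad (i)) Let $L_0$ be the set of the common non-logical symbols of $\varphi$ and $\psi$,
let $R$ and $S$ be the sets of non-logical symbols outside $L_0$ in $\varphi$ and
in $\psi$, respectively. Then $\models \varphi \to \psi$ is equivalent to saying that
$\exists R\,\varphi \models \neg \exists S\,(\neg \psi)$. Applying (ii) we get the $L_0$-sentence $\theta = \Phi_\alpha$ in
$\{\exists R\,\varphi, \alpha\}^+$ with $\alpha \in \{\exists R\,\varphi, \exists S\,(\neg \psi)\}^+$ such that $\models \varphi \to \theta \wedge \theta \to \psi$. With $A$
and $B$ as in the assumption, clearly $\theta \in A$.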
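The first step of (i) rests on the following equivalence, spelled out here as a sketch; $\mathfrak{N}$ is a structure for the full vocabulary $L_0 \cup R \cup S$, and $\mathfrak{N}_0$ is a structure for $L_0$ only (both names are introduced here for illustration):

```latex
\begin{align*}
\not\models \varphi \to \psi
 &\iff \text{some } \mathfrak{N} \text{ satisfies }
       \varphi(R) \wedge \neg\psi(S) \\
 &\iff \text{some } L_0\text{-structure } \mathfrak{N}_0 \text{ satisfies }
       \exists R\,\varphi(R) \wedge \exists S\,\neg\psi(S).
\end{align*}
```

The second equivalence uses that $R$ and $S$ are disjoint, so interpretations of $R$ and of $S$ over the same $L_0$-structure can be combined into one expansion.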


Let us restate some direct consequences of 8.2 for emphasis.


8.3. COROLLARY (Suslin). Disjoint analytic sets (of the Cantor space) can be
separated by a Borel set. In fact, given an analytic set $A$, there are $\aleph_1$ Borel
sets $B_\alpha$, $\alpha < \aleph_1$, such that $A = \bigcap_{\alpha < \aleph_1} B_\alpha$ and whenever $A'$ is another
analytic set disjoint from $A$, then some $B_\alpha$ separates $A$ from $A'$.

8.4. COROLLARY (Barwise Interpolation Theorem). Two disjoint $\Sigma^1_1$-over-$L_A$-classes
of structures can be separated by an $L_A$-elementary class. Hence,
$\Delta^1_1$-over-$L_A$- and $L_A$-elementary-classes coincide.


Next, we formulate a second order generalization of 6.4(iii). 


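8.5. COROLLARY. Let $\mathfrak{M} = (|\mathfrak{M}|, R_1, \dots, R_l)$ be a countably infinite structure
and let $X$ be a class of $n$-ary relations on $|\mathfrak{M}|$. Let $A = \mathrm{HYP}_{\mathfrak{M}}$ and
$A_0 = L_{\mathrm{Ord}(A)}$ (the smallest pure admissible set with the same ordinal as $A$).
Then the following are equivalent:

(i) $X$ is $\Delta^1_1$-over-$L_A$ on $\mathfrak{M}$.

(ii) $X$ is $\Delta^1_1$(-over-$L_{\omega\omega}$) on $\mathfrak{M}$.

(iii) $X$ is $L_A$-elementary on $\mathfrak{M}$.

(iv) $X$ is $L_{A_0}$-elementary on $\mathfrak{M}$.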


Remark. Notice that the implication (i) $\to$ (iii) is a consequence of the
Barwise interpolation theorem. On the other hand, by passing to $X =
\{\{n\} : n \in B\}$ from a subset $B$ of $\omega$, the equivalence (i) $\Leftrightarrow$ (iii) is a
generalization of the second part of 6.5. This is how the Kleene Theorem
(cf. the Remark after 6.6'), the Barwise interpolation theorem, and the
Suslin separation theorem are related in the general result 8.2.


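PROOF OF COROLLARY 8.5. First one establishes that if $Y$ is $L_A$-elementary on
$\mathfrak{M}$, then $Y$ is $\Pi^1_1$ on $\mathfrak{M}$. We omit the proof, which is similar to that of 3.11. Then
one sees that the implication (iii) $\to$ (ii) follows. "(ii) $\to$ (iii)" is contained in
8.4, but this is how we show the more refined "(ii) $\to$ (iv)". Assume that $X$
is $\Delta^1_1$ on $\mathfrak{M}$:

\[ S \in X \Leftrightarrow (\mathfrak{M}, S) \models \exists T_1\, \varphi_1(S, T_1) \Leftrightarrow (\mathfrak{M}, S) \models \forall T_2\, \varphi_2(S, T_2), \]

where $\varphi_1$, $\varphi_2$ are finitary, using some parameters $p$ in $|\mathfrak{M}|$. Now, let $\varphi_0$ be an
$A$-finite sentence "describing" $\mathfrak{M}$ in a language containing a constant for
each $a \in |\mathfrak{M}|$; $\varphi_0$ is

\[ \bigwedge \mathrm{Diag}(\mathfrak{M}) \wedge \forall x \bigvee_{a \in |\mathfrak{M}|} x = \ulcorner a \urcorner. \]

We have $\varphi_1(S, T_1) \models \varphi_0 \to \varphi_2(S, T_2)$ (why?). Hence, there is a Craig interpolant
$\theta \in A_0$ (see 8.2(i)!) such that $\theta$ is over the language $\{R_1, \dots, R_l, S, p\}$
and such that $\models \varphi_1 \to \theta$ and $\models \theta \wedge \varphi_0 \to \varphi_2$. Hence, $\theta(S, p)$ will define $X$ on
$\mathfrak{M}$ (exercise). □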


The Lopez-Escobar generalization of Craig’s interpolation theorem,
8.2(i) without the part “In fact, ...”, was one of the first important results in
infinitary logic. Lopez-Escobar’s proof was through a complete Gentzen-type
infinitary formal system, quite similar to Craig’s original proof.
Barwise established the same result for admissible fragments of $L_{\omega_1\omega}$. A
natural proof of these results can be given using consistency properties, cf.
KEISLER [1971]. The full strength of 8.2, however, does not seem to result
from these proofs. RESSAYRE [1976] established the full 8.2 without using
game-formulas (in fact, independently of VAUGHT [1973], but later), but
Vaught’s proof looks better to us.

The next corollary is a strong version of Beth’s definability theorem (cf. 
e.g. Chapter A.2). 


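8.6. BETH'S THEOREM. Let $\sigma(P)$ be a $\Sigma^1_1$-sentence over $L(P)_A$ or, more
generally, let $\sigma(P)$ be $\exists R \bigwedge \Psi(R, P)$ for some $\Sigma_A$ set $\Psi$ of sentences in
$L(R, P)_A$. Suppose that on any $L$-structure $\mathfrak{M}$ there is at most one relation $P$
such that $(\mathfrak{M}, P) \models \sigma$. Then there is an $L_A$-formula $\varphi(x)$ such that
$\sigma \models \forall x\, (Px \leftrightarrow \varphi(x))$.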


PROOF. Let $\sigma = \exists R\, \psi(R, P)$. Let $R_1$, $R_2$ be distinct copies of $R$, $P_1$, $P_2$ of $P$,
and $a$ some individual constants. Then the hypothesis implies that the
implication $[\psi(R_1, P_1) \wedge P_1 a] \to [\psi(R_2, P_2) \to P_2 a]$ is logically valid. The
two sides of the implication have only $L$ and $a$ in common. By the
interpolation theorem, there is an $L_A(a)$ formula $\varphi(a)$ that is a Craig
interpolant. Then $\varphi(x)$ is the required $L_A$-definition of $P$. Exercise:
Complete the proof for the more general kind of $\sigma$. □


In the proof of the next theorem, we will utilize the exact form of 
Vaught’s approximations. 


8.7. THEOREM. Let $\varphi$ be a sentence of $L_A$, $L$ containing a binary predicate
symbol $<$. Suppose that for each $\beta < \mathrm{Ord}(A)$, there is a model $\mathfrak{M}$ of $\varphi$ such
that $<^{\mathfrak{M}}$ is a linear ordering containing a subordering having order type $\beta$.
Then $\varphi$ has a model $\mathfrak{M}$ such that $<^{\mathfrak{M}}$ contains a copy of the ordering of the
rational numbers.


8.8. COROLLARY. If each model of $\varphi$ is well-ordered, then there is an ordinal
$\beta \in A$ such that each model of $\varphi$ has order type $\le \beta$.


8.9. COROLLARY. The class of countable well-orderings is not a $\Sigma^1_1$-over-$L_{\omega_1\omega}$-class
of countable structures.




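PROOF OF THEOREM 8.7. We will assume that in each model we talk about,
$<$ automatically denotes a linear ordering (why can we do this?). We start
by writing down a purely existential “Vaught sentence” whose models are
precisely the extensions of $Q$, the ordering of the rational numbers. Let
$\{r_n : 1 \le n < \omega\}$ be a repetition-free enumeration of $|Q|$. Let $\theta_n(x_1, \ldots, x_n)$
denote the conjunction


$\bigwedge \{x_i < x_j : r_i < r_j,\ 1 \le i, j \le n\}$.


Then the sentence $\Phi$ is


$\exists x_1\, \exists x_2 \cdots \exists x_n \cdots \bigwedge_{n<\omega} \theta_n(x_1, \ldots, x_n)$;


$\Phi$ clearly serves our purposes. By contraposition, assume that $\varphi$ does not
have a model containing a copy of $Q$. Then we have $\Phi \models \neg\varphi$. Note that
$\Phi \in A$ whenever $\omega \in A$, and $\Phi$ is $A$-recursive always. By the covering
theorem, there is $\alpha_0 < \mathrm{Ord}(A)$ such that $\Phi_{\alpha_0} \models \neg\varphi$. By analyzing the
meaning of $\Phi_{\alpha_0}$, we will obtain the desired conclusion, as follows.

Let us say that a linear ordering has length $\ge \alpha$ if it has a subordering of
type $\alpha$. Let $r_{i_1} < \cdots < r_{i_n}$ be the proper ordering of the first $n$ rationals. We
claim that for any ordinal $\alpha$, and any ordering $\mathfrak{M} = (|\mathfrak{M}|, <^{\mathfrak{M}})$, if
$a_{i_1} <^{\mathfrak{M}} \cdots <^{\mathfrak{M}} a_{i_n}$ and each of the open intervals $(-\infty, a_{i_1})$, $(a_{i_1}, a_{i_2})$, $\ldots$,
$(a_{i_n}, \infty)$ is of length $\ge 3^{\alpha}$, then $\mathfrak{M} \models \Phi^n_\alpha[a_1, \ldots, a_n]$. For $n = 0$, the
hypothesis is understood to mean that $\mathfrak{M}$ has length $\ge 3^{\alpha}$ and the
conclusion is $\mathfrak{M} \models \Phi_\alpha$. The proof is an induction on $\alpha$. For $\alpha = 0$, the claim
is easily seen. For $\alpha = \beta + 1$, $\Phi^n_\alpha(x_1, \ldots, x_n) = \exists x_{n+1}\, \Phi^{n+1}_\beta(x_1, \ldots, x_{n+1})$. We
now look at which one of the intervals $(-\infty, r_{i_1})$, $(r_{i_1}, r_{i_2})$, $\ldots$, $(r_{i_{n-1}}, r_{i_n})$, $(r_{i_n}, \infty)$
the rational $r_{n+1}$ falls into. Suppose e.g. $r_{i_k} < r_{n+1} < r_{i_{k+1}}$. Since $3^{\beta+1} \ge 3^{\beta} + 1 + 3^{\beta}$, there is
$a = a_{n+1}$ such that $a_{i_k} <^{\mathfrak{M}} a <^{\mathfrak{M}} a_{i_{k+1}}$, and both $(a_{i_k}, a)$ and $(a, a_{i_{k+1}})$ are of
length $\ge 3^{\beta}$. Hence, by the induction hypothesis, $\mathfrak{M} \models \Phi^{n+1}_\beta[a_1, \ldots, a_n, a_{n+1}]$,
hence $\mathfrak{M} \models \Phi^n_{\beta+1}[a_1, \ldots, a_n]$. For $\alpha$ a limit ordinal, the induction step is
trivial.

The claim and $\Phi_{\alpha_0} \models \neg\varphi$ imply that whenever $\mathfrak{M} \models \varphi$, then $\mathfrak{M}$ does not
have length $\ge 3^{\alpha_0}$. Since, as it is easy to see, $\alpha_0 \in A$ implies that $3^{\alpha_0} \in A$,
this proves the theorem. □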
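
The ordinal inequality used in the successor step of the induction above is stated without comment. A short derivation, using only the standard rules of ordinal arithmetic (this worked step is an addition and does not appear in the original proof), might read:

```latex
\[
3^{\beta+1} \;=\; 3^{\beta}\cdot 3 \;=\; 3^{\beta} + 3^{\beta} + 3^{\beta}
\;\ge\; 3^{\beta} + 1 + 3^{\beta},
\]
% The last step uses 3^{\beta} \ge 1 together with the fact that ordinal
% addition is monotone in each argument. For \beta = 0 equality holds
% (3 = 1 + 1 + 1), so the inequality cannot be made strict in general.
```

Consequently an ordering of length at least $3^{\beta+1}$ can be cut at a single point so that both remaining open pieces still have length at least $3^{\beta}$, which is exactly what the choice of $a_{n+1}$ requires.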


Cf. KEISLER [1971] for an alternative proof, using consistency properties.
In BARWISE [1975] there is a proof that is a direct application of the
$\Sigma$-compactness theorem.

We presented the above proof to illustrate a typical use of Vaught’s
covering theorem. In MAKKAI [1973] and HARNIK and MAKKAI [1975] there
are other applications of the covering theorem where the form and/or the
meaning of the approximations play a role. Thus, the importance of the
covering theorem is not exhausted by its main application, 8.2; rather, in
many cases it is important to use the exact form of the approximations.
This is a fine example of the sense in which a logical generalization (the
covering theorem) supersedes the original descriptive set-theoretical result
(Suslin’s theorem).

In VAUGHT [1973], there are further results whose proofs are based on 
game-sentences and their approximations. We state two of them. 


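8.10. REDUCTION THEOREM FOR $L_{\omega_1\omega}$ (VAUGHT [1973]). For any two
$\Pi^1_1$-over-$L_{\omega_1\omega}$-classes $K_1$, $K_2$ of countably infinite structures, there are
$\Pi^1_1$-over-$L_{\omega_1\omega}$-classes $K_1^*$, $K_2^*$ such that $K_1^* \subseteq K_1$, $K_2^* \subseteq K_2$, $K_1^* \cap K_2^* = \emptyset$ and
$K_1^* \cup K_2^* = K_1 \cup K_2$.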


8.11. THEOREM (VAUGHT [1973]). Any $\Sigma^1_1$-over-$L_{\omega_1\omega}$- (and actually,
$\Sigma^1_2$-over-$L_{\omega_1\omega}$-) class of countable structures is the union of $\aleph_1$
$L_{\omega_1\omega}$-elementary classes.


It turns out that Scott’s isomorphism theorem (cf. SCOTT [1965]) is a
direct consequence of 8.11. On the other hand, both results 8.10, 8.11
specialize to classical results of descriptive set-theory, e.g. the first to the
reduction property of complement-analytic sets.

RESSAYRE [1976] contains further interesting applications of model
theory to descriptive set-theory. Ressayre gives a proof of a characterization
of $\Pi^1_1$ sets of subsets of $\omega$ using admissible sets, the
Spector-Gandy theorem. Let $\mathfrak{M} = (|\mathfrak{M}|, R_1, \ldots, R_l)$ be a countable structure
of a finite similarity type.


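8.12. SPECTOR-GANDY THEOREM. A set $X$ of subsets of $|\mathfrak{M}|$ is $\Pi^1_1$ iff there is
a $\Sigma$-formula $\sigma(x, y)$ and some parameters $p$ in $|\mathfrak{M}| \cup \{\mathfrak{M}, R_1, \ldots, R_l\}$ such
that


$X = \{B \subseteq |\mathfrak{M}| : \mathrm{HYP}_{(\mathfrak{M}, B)} \models \sigma[B, p]\}$.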


Notice that 8.12 is a second-order generalization of 6.4(ii). Its proof is
very similar to that of 6.4. Ressayre uses 8.12 (for $\mathfrak{M} = (\omega, 0, 1, +, \cdot)$) to
give a “model-theoretic” proof of the Novikov-Kondo uniformization
theorem (cf. Chapter C.8).


9. The perfect subset theorem


A classical theorem of descriptive set-theory says that any uncountable
analytic set, say of the Cantor discontinuum, contains a perfect subset, and
hence has power $2^{\omega}$ (cf. Chapter C.8).

On the other hand, a theorem in HARRISON [1966] (cf. also MANSFIELD
[1970]) says that, if $X$ is a countable light-face (for simplicity) $\Sigma^1_1$ set of
subsets of $\omega$, then each element of $X$ is hyperarithmetic. We will prove a
model-theoretic theorem that has both theorems as consequences. It also
simultaneously generalizes the definability theorem of REYES [1970] (cf.
also CHANG [1964], MAKKAI [1964] and CHANG and KEISLER [1973]) and a
theorem of KUEKER [1968].

Let $A$ be a countable admissible set. Let $\theta = \{\theta_i(v, u_i) : i \in I\}$ be an
$A$-finite family of $L_A$-formulas and suppose that an $L(P)$-structure $(\mathfrak{M}, P)$
satisfies the disjunction


$\bigvee_{i \in I} \exists u_i\, \forall v\, (P(v) \leftrightarrow \theta_i(v, u_i))$,


denoted shortly by $\delta_\theta$. Then $P$ is defined by one of the formulas $\theta_i(v, b_i)$
with some parameters $b_i$ in $|\mathfrak{M}|$. Now, if for a sentence $\sigma$ over $L(P)$ (in any
“logic”), we know that $\sigma \models \delta_\theta$, then for any $L$-structure $\mathfrak{M}$ and any $P$ on
$|\mathfrak{M}|$ such that $(\mathfrak{M}, P) \models \sigma$, the above conclusion holds. In particular, for a
fixed countable $L$-structure $\mathfrak{M}$, $\{P : (\mathfrak{M}, P) \models \sigma\}$ will be a countable set. This
shows the implication (iii) $\Rightarrow$ (i) in the following theorem.


9.1. THEOREM. For a $\Sigma^1_1$-sentence $\sigma$ of form $\exists R \bigwedge \Psi(R, P)$, with a $\Sigma_A$ set
$\Psi(R, P)$ of sentences of $L(R, P)_A$ (in particular, $\sigma$ can be a $\Sigma^1_1$ sentence over
$L(P)_A$), the following are equivalent.
(i) For every countable $L$-structure $\mathfrak{M}$, $\{P : (\mathfrak{M}, P) \models \sigma\}$ is countable.
(ii) For every countable $L$-structure $\mathfrak{M}$, $\{P : (\mathfrak{M}, P) \models \sigma\}$ does not contain
a perfect subset (in the Cantor space $2^{|\mathfrak{M}|^n}$, if $P$ is $n$-ary).
(iii) For some $A$-finite $\theta$ as above, $\sigma \models \delta_\theta$.


9.2. COROLLARY (HARRISON [1966]). For a $\Sigma^1_1$ set $X$ of subsets of $\omega$, if $X$ does
not contain a perfect subset, every element of $X$ is hyperarithmetic; in fact, $X$
is a subset of an $\omega^{+}$-finite set.


PROOF (based on 9.1). Let $\sigma(P)$ be the $\Sigma^1_1$-sentence defining $X$ on
$\mathfrak{M} = (\omega, 0, 1, +, \cdot)$. Apply (ii) $\Rightarrow$ (iii) for $A = \omega^{+} = L_{\omega_1^{CK}}$. We obtain that for
some $A$-finite $\theta = \{\theta_i : i \in I\}$,
$X \subseteq Y =_{\mathrm{df}} \{\{a \in \omega : \mathfrak{M} \models \theta_i[a, b_i]\} : i \in I,\ b_i \in {}^{<\omega}\omega\}$.
$Y \subseteq A$, in fact $Y$ is an $A$-finite set (why?). By 6.5, the assertion follows. □


For a first-order sentence $\sigma$ of $L(P)_A$, 9.1 was proved in MAKKAI [1973b].
Subsequently Barwise noticed the present more general (and more useful)
form, cf. BARWISE [1975]. The proof in MAKKAI [1973b] as well as the one in
BARWISE [1975] uses consistency properties (“consistency machines”, in
Barwise’s words). The proof via 9.3 given below, using $\Sigma_A$-saturated
structures, is due to HARNIK [1974]. The next theorem is a direct generalization
of Harrison’s theorem.


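9.3. THEOREM (HARNIK [1974]). Let $\sigma$ be a $\Sigma^1_1$-sentence $\exists R \bigwedge \Psi(R, P)$,
with $\Psi(R, P)$ a $\Sigma_A$ set of sentences in $L(R, P)_A$ (in particular, $\sigma$ can be a
$\Sigma^1_1$-sentence over $L(P)_A$). Let $\mathfrak{M}$ be a countable $\Sigma_A$-saturated $L$-structure
and assume that there is $P_0 \subseteq |\mathfrak{M}|$ such that $(\mathfrak{M}, P_0) \models \sigma$ and $P_0$ is not
$L_A$-elementary on $\mathfrak{M}$. Then $\{P : (\mathfrak{M}, P) \models \sigma\}$ contains a perfect subset.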


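PROOF. Let $\sigma = \exists R \bigwedge \Psi(R, P)$. Let $\Theta(P)$ be the set
$\{\neg \exists u\, \forall v\, (P(v) \leftrightarrow \theta(v, u)) : \theta \in L_A\}$.


Clearly, $\Theta(P)$ is an $A$-recursive set. The hypothesis implies that $(\mathfrak{M}, P_0)$ is
a model of $\{\sigma\} \cup \Theta(P)$. By induction, we are going to define finite sets
$M_0 \subseteq M_1 \subseteq \cdots \subseteq M_n \subseteq \cdots$, $M_n \subseteq |\mathfrak{M}|$, and $\Sigma_A$ sets $T_s(R, P)$ of sentences of
$L(R, P)_A(|\mathfrak{M}|)$ (using finitely many constants denoting elements in $|\mathfrak{M}|$)
for $s \in {}^{<\omega}2$ such that (i)-(iv) below hold:

(i) If $\mathrm{lh}(s) = n$, then $T_s(R, P) \subseteq L(R, P)_A(M_n)$; if $s \subseteq s'$, then
$\Psi \subseteq T_s(R, P) \subseteq T_{s'}(R, P)$.

(ii) If $s$ and $s'$ are incomparable, then $T_s(R, P) \cup T_{s'}(R', P') \models
\neg(P(a) \leftrightarrow P'(a))$ for some constant $a \in M_n$, $n = \min(\mathrm{lh}(s), \mathrm{lh}(s'))$.

The definitions will be devised to ensure that for all $\eta \in {}^{\omega}2$, $(\mathfrak{M}, a)_{a \in |\mathfrak{M}|}$ is
a model of $\exists R\, \exists P \bigwedge (\bigcup_{n<\omega} T_{\eta|n}(R, P))$. This will produce a perfect set of
$P$'s such that $(\mathfrak{M}, P) \models \sigma$. The essential induction hypothesis for the
construction is

(iii)$_n$ $(\mathfrak{M}, a)_{a \in M_n} \models \exists P\, \exists R \bigwedge (T_s(R, P) \cup \Theta(P))$ for all $s \in {}^{n}2$. (This will
ensure, among other things, that $\bigcup_{n<\omega} T_{\eta|n}(R, P)$ is finitely consistent with
$\mathrm{Diag}(\mathfrak{M})$.) To formulate one more condition, let $\exists u\, \delta_0(u)$, $\exists u\, \delta_1(u)$, $\ldots$
enumerate all valid sentences in $L(R, P)_A(|\mathfrak{M}|)$ of the form $\exists u\, \delta(u)$, and
let $\bigvee \Psi_0$, $\bigvee \Psi_1$, $\ldots$ enumerate all valid disjunctions in the same logic. The
last useful property of our construction will be:

(iv) If $\mathrm{lh}(s) = n + 1$, then $T_s(R, P) \models \delta_n(a)$ for some $a \in |\mathfrak{M}|$ and
$T_s(R, P) \models \psi_n$ for some $\psi_n \in \Psi_n$.

Suppose we have constructed all the $T_s$, $s \in {}^{<\omega}2$, satisfying (i)-(iv). Let $\eta$
be a fixed element of ${}^{\omega}2$, $T_\eta =_{\mathrm{df}} \bigcup_{n<\omega} T_{\eta|n}$. $T_\eta$ satisfies the conditions of 4.3
(especially by (iv)), with $C = |\mathfrak{M}|$. Hence, by 4.3, there is a model $\mathfrak{M}'_\eta$ of $T_\eta$,
all of whose elements are denoted by some $a \in |\mathfrak{M}|$. Since $T_\eta$ is finitely
consistent with $\mathrm{Diag}(\mathfrak{M})$ and $T_\eta$ is complete (cf. 4.3), $\mathrm{Diag}(\mathfrak{M}) \subseteq T_\eta$. It
follows that $a \mapsto a^{\mathfrak{M}'_\eta}$ is an isomorphism of $\mathfrak{M}$ onto $\mathfrak{M}'_\eta | L$, hence we can
assume that $\mathfrak{M}'_\eta | (L \cup \{P\})$ is of the form $(\mathfrak{M}, P_\eta)$. It follows from (i) and (ii)
that $\{P_\eta : \eta \in {}^{\omega}2\}$ is a perfect subset of $2^{|\mathfrak{M}|}$, proving the theorem. What is
left is carrying out the construction of the $T_s$.

Define $M_0 = \emptyset$ and $T_\emptyset(R, P) = \Psi(R, P)$. The essential condition (iii)$_0$ is
satisfied by our hypotheses. Assume next that $T_s(R, P)$ has been defined
for all $s$ with $\mathrm{lh}(s) = n$ (and so, $M_n$ has been defined as well). To define $T_{s\langle 0\rangle}$
and $T_{s\langle 1\rangle}$, we prove first: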


9.4. CLAIM. There are $P_0, P_1 \subseteq |\mathfrak{M}|$, $P_0 \ne P_1$, such that $(\mathfrak{M}, a)_{a \in M_n} \models
\exists R \bigwedge (T_s(R, P_i) \cup \Theta(P_i))$ for $i = 0, 1$.


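Otherwise, if this were false, we would have
$(\mathfrak{M}, a)_{a \in M_n} \not\models \exists P_0\, \exists P_1\, \exists R_0\, \exists R_1 \bigwedge S$


where $S$ is the $\Sigma_A$ theory of $L(R_0, R_1, P_0, P_1)_A$ defined by $S =
T_s(R_0, P_0) \cup \Theta(P_0) \cup T_s(R_1, P_1) \cup \Theta(P_1) \cup \{P_0 \ne P_1\}$. Then, since $\mathfrak{M}' =
(\mathfrak{M}, a)_{a \in M_n}$ is $\Sigma_A$-saturated, by relation universality (cf. 7.4) there is an
$L_A(M_n)$-consequence $\neg\psi$ of $S$ that fails to hold in $\mathfrak{M}'$. Thus, $\mathfrak{M}' \models \psi$ and
$\{\psi\} \cup T_s(R_0, P_0) \cup \Theta(P_0) \cup T_s(R_1, P_1) \cup \Theta(P_1) \models P_0 = P_1$, i.e., for every
$L(M_n)$-model $\mathfrak{N}$ of $\psi$ there is at most one $P$ such that
$(\mathfrak{N}, P) \models \exists R \bigwedge (T_s(R, P) \cup \Theta(P))$. So, by Beth’s theorem 8.6, there is an
$L_A(M_n)$-formula $\theta(v, a)$ such that $\{\psi\} \cup T_s(R, P) \cup \Theta(P) \models \forall v\, (P(v) \leftrightarrow
\theta(v, a))$. But then $\{\psi\} \cup T_s(R, P) \cup \Theta(P)$ is contradictory (since
$\neg \exists u\, \forall v\, (P(v) \leftrightarrow \theta(v, u))$ belongs to $\Theta(P)$), contradicting $\mathfrak{M}' \models \psi$ and the
induction hypothesis (iii)$_n$. This proves 9.4.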

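Having shown the Claim, for fixed $s \in {}^{n}2$ pick $P_0, P_1 \subseteq |\mathfrak{M}|$ such that
$P_0 \ne P_1$ and $(\mathfrak{M}', P_i) \models \exists R \bigwedge (T_s(R, P) \cup \Theta(P))$ for $i = 0, 1$. Let $a =
a_s \in |\mathfrak{M}|$ be such that $a \in P_0$ iff $a \notin P_1$. Assume e.g. that $a \in P_0$ and $a \notin P_1$.
Define $T'_{s\langle 0\rangle} = T_s \cup \{P(a)\}$ and $T'_{s\langle 1\rangle} = T_s \cup \{\neg P(a)\}$. Now, also fix
$i \in \{0, 1\}$. To define $T_{s\langle i\rangle}$, pick $R$ such that $(\mathfrak{M}', R, P_i) \models T_s(R, P)$ and take
$T_{s\langle i\rangle}(R, P) = T'_{s\langle i\rangle}(R, P) \cup \{\delta_n(b), \psi_n\}$ where $b \in |\mathfrak{M}|$ and $\psi_n \in \Psi_n$ are
chosen so that $(\mathfrak{M}', R, P_i) \models \delta_n[b] \wedge \psi_n$. This completes the construction of
$T_{s'}$ for each $s' = s\langle i\rangle$ of length $n + 1$; notice that $T_{s'}$ contains two more
constants from $|\mathfrak{M}|$, namely $a$ and $b$, in addition to those in $T_s$. We let the
finite set $M_{n+1} \subseteq |\mathfrak{M}|$ contain $M_n$ and all the $a$ and $b$ involved in the finitely
many $T_{s'}$ with $\mathrm{lh}(s') = n + 1$. It is clear that our construction will meet the
requirements (i)-(iv). □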


PROOF OF THEOREM 9.1. We have to show the implication (ii) $\Rightarrow$ (iii), i.e.,
$\neg$(iii) $\Rightarrow$ $\neg$(ii). Assuming $\neg$(iii), the $\Sigma_A$-theory $\Psi \cup \Theta(P)$ ($\Theta(P)$ taken
from the proof of 9.3) is consistent, hence it has a $\Sigma_A$-saturated model
$(\mathfrak{M}, R, P_0)$ (cf. 7.3). Then $\mathfrak{M}$ is $\Sigma_A$-saturated (why?) and $P_0$ is not
$L_A$-elementary on $\mathfrak{M}$. Since $(\mathfrak{M}, P_0) \models \sigma$, $\neg$(ii) is a consequence of 9.3. □


Preface


The aim of this paper is to give a survey of category-theoretic methods in 
logic, in particular in model theory and set theory. We organize the 
material by increasing richness of the “doctrines” involved. These doctrines
are categorical analogues of fragments of logical theories which have
sufficient category-theoretic structure for their models to be described as 
functors. In particular, we deal with the doctrines of equational, cartesian, 
finitary coherent and infinitary coherent logic. Higher order logic and set 
theory are touched upon in Sections 6 and 7. 

Certain themes run through the various doctrines: invariance of presen- 
tation, relative interpretation, completeness, conceptual completeness and 
generic structures (as well as operations performed on these). 

Fundamental to the functorial approach to model theory is the introduc- 
tion of universes (categories) other than sets where models live. This makes 
possible the notion of generic structure considered here. 


1. Algebraic theories 


The introduction of the categorical notion of algebraic theory led to a 
systematic theory of relative interpretations of one equational theory into 
another, as well as a theory about the categories (or varieties) of algebras 
for these, and their relationship. This progress springs from having a 
presentation-invariant notion of equational (or algebraic) theory. 


1.1. HISTORICAL REMARK. It is clear that the two standard ways of presenting
the notion of abelian group, namely (i) a set equipped with a binary operation
‘plus’, a unary operation ‘minus’ and a nullary operation (constant) 0 (with
certain equations satisfied), or (ii) a set equipped with a binary ‘minus’
and a nullary ‘0’ (with certain equations satisfied), differ only in which
operations are considered primitive and which derived. The “equational
theories” are the same, or the “totality” of operations (primitive and
derived) can be identified in the two cases, and in such a way that the
substitution of operations into other operations is preserved
under the identification. The attempt to understand this phenomenon in precise
terms, as an isomorphism between two mathematical structures “consisting of”
operations and equipped with some kind of substitution structure, led to the
invention of concepts like “clone of operations” (P. Hall, see COHN
[1965]) or “compositeur” (LAZARD [1955]), but these were somewhat
unmanageable notions. The right way of conceiving the totality of operations
for an equational theory was found by LAWVERE [1963], who realized
that substitution should be viewed as the composition of arrows in a certain
kind of category:


1.2. DEFINITION. An algebraic theory $T$ is a category whose objects are the
natural numbers $0, 1, 2, \ldots$ and which for each $n$ is equipped with an
$n$-tuple of maps


$\mathrm{proj}_i\colon n \to 1$, $i = 1, \ldots, n$


making $n$ into the $n$-fold categorical product of 1, $n = 1^n$.


1.3. EXAMPLE. We define an algebraic theory $T$ with


$\hom(n, m)$ = $m$-tuples of polynomials
in $X_1, \ldots, X_n$ with integral coefficients


with substitution of polynomials as composition. (The $\mathrm{proj}_i$ is just $X_i$,
considered as a polynomial in the variables $X_1, \ldots, X_n$, $1 \le i \le n$.) This $T$ is
called the theory of commutative rings. For further examples, see 1.14.
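
To make "substitution of polynomials as composition" concrete, here is a minimal Python sketch of the hom-sets of Example 1.3. It is an illustration only: the dictionary encoding of polynomials and all function names (`const`, `var`, `add`, `mul`, `substitute`, `compose`) are assumptions of this sketch, not notation from the text.

```python
from itertools import product

# A polynomial in n variables with integer coefficients is encoded as a dict
# mapping exponent tuples of length n to nonzero coefficients.

def const(n, c):
    """The constant polynomial c in n variables."""
    return {(0,) * n: c} if c else {}

def var(n, i):
    """The projection proj_i = X_i in n variables (i is 1-based)."""
    return {tuple(1 if k == i - 1 else 0 for k in range(n)): 1}

def add(p, q):
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) + c
        if out[e] == 0:
            del out[e]
    return out

def mul(p, q):
    out = {}
    for (e1, c1), (e2, c2) in product(p.items(), q.items()):
        e = tuple(a + b for a, b in zip(e1, e2))
        out[e] = out.get(e, 0) + c1 * c2
        if out[e] == 0:
            del out[e]
    return out

def substitute(p, args, n):
    """Substitute the polynomials in args (each in n variables) for the variables of p."""
    result = const(n, 0)
    for exps, c in p.items():
        term = const(n, c)
        for g, k in zip(args, exps):
            for _ in range(k):
                term = mul(term, g)
        result = add(result, term)
    return result

def compose(g, f, n):
    """Composite of f : n -> m followed by g : m -> k, each a tuple of polynomials."""
    return tuple(substitute(p, f, n) for p in g)

x1, x2 = var(2, 1), var(2, 2)
f = (add(x1, x2), mul(x1, x2))                    # f : 2 -> 2, (X1 + X2, X1 * X2)
g = (add(mul(x1, x1), mul(const(2, -2), x2)),)    # g : 2 -> 1, X1^2 - 2 * X2
identity = (x1, x2)                               # the projections form the identity on 2
sum_of_squares = (add(mul(x1, x1), mul(x2, x2)),)
```

With these definitions, `compose(g, f, 2)` is the one-tuple containing $X_1^2 + X_2^2$, and composing with `identity` on either side leaves a map unchanged, as the category axioms require.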


1.4. For every algebraic theory, a map $n \to m$ in it can be described by an
$m$-tuple of maps $n \to 1$ because $m$ is an $m$-fold product of 1; so the maps
$n \to 1$ play a special role: they are called the $n$-ary operations of the theory.
In the example above, an n-ary operation is a polynomial in n variables, 
which is also precisely what an n-ary (derived) operation is in any 
presentation of the equational theory of commutative rings. 

Similarly, there exist algebraic theories for all other equational theories 
like the theory of groups, lattices, Lie-algebras, ...; in these, the n-ary 
operations are (composite) operations in n variables (that is, are n-ary 
terms), with two such identified if their equality follows from the axioms 
(basic equations) of the theory. This description indicates how one 
constructs an algebraic theory (in the sense of Definition 1.2) for each 
equational theory. This might be called the syntactical way of constructing 
algebraic theories. 

The semantics of algebraic theories, that is, the process which leads from 
the theory to its category of models, can now be explained in the following 
way. 

1.5. DEFINITION. Let $T$ be an algebraic theory, and $\mathcal{E}$ a category with finite
products. The category of $T$-algebras or $T$-models in $\mathcal{E}$ is the full subcategory
$\mathrm{Alg}(T, \mathcal{E})$ of the functor category $[T, \mathcal{E}]$ whose objects are the finite
product preserving functors.


1.6. REMARK. (i) When considering $\mathrm{Alg}(T, \mathcal{E})$, we call $\mathcal{E}$ the value category.
If $\mathcal{E}$ = (Sets), we write $\mathrm{Alg}(T)$ for $\mathrm{Alg}(T, \mathcal{E})$.

(ii) We know the value of an algebra $A\colon T \to \mathcal{E}$ at an $n \in T$ if we know
$A(1)$, since $A(n) = A(1)^n$, $A$ being product preserving and $n$ being $1^n$. To
describe $A$ on the morphisms of $T$, say, a morphism (“operation”)
$\omega\colon 2 \to 1$, amounts to describing a map

$A(2) = A(1) \times A(1) \to A(1)$,
that is, a binary operation on the object $A(1)$. So in case $\mathcal{E}$ = (Sets), $A$
interprets abstract binary operations $\omega\colon 2 \to 1$ as actual binary operations
on a single set $A(1)$, called the underlying set of the algebra $A$.

(iii) With these remarks, it is not hard to understand why, in the case
where $T$ is, say, the algebraic theory of commutative rings, we have an
equivalence of categories $\mathrm{Alg}(T, (\mathrm{Sets})) \simeq$ Category of commutative rings.
One should note that the naturality requirement on transformations
$\tau\colon A \to B$ precisely amounts to requiring that $\tau_1\colon A(1) \to B(1)$ is a
homomorphism with respect to all the operations of the theory. If, for
instance, $\omega\colon 2 \to 1$ is a binary operation in $T$, both things amount to
commutativity of the diagram


$$
\begin{array}{ccccc}
A(1) \times A(1) & \cong & A(2) & \xrightarrow{\;A(\omega)\;} & A(1) \\
\downarrow{\scriptstyle \tau_1 \times \tau_1} & & \downarrow{\scriptstyle \tau_2} & & \downarrow{\scriptstyle \tau_1} \\
B(1) \times B(1) & \cong & B(2) & \xrightarrow{\;B(\omega)\;} & B(1)
\end{array}
$$

(the left-hand square being also commutative because of naturality).

1.7. Relative interpretation 


As mentioned, relative interpretations of one equational theory into 
another can now be dealt with very rationally: 


1.8. DEFINITION. A morphism of algebraic theories is a functor $f\colon T \to T'$
with $f(n) = n$ and with $f(\mathrm{proj}_i) = \mathrm{proj}_i$ for all the specified projections.


1.9. EXAMPLE. We give two trivial examples and one non-trivial example.
(i) Let $T$ be the theory of abelian groups, $T'$ the theory of commutative
rings. There is an obvious inclusion $T \to T'$; in fact $T(n, 1)$ may be viewed as
the subset of $T'(n, 1)$ (= set of polynomials in $n$ variables) consisting of
homogeneous degree 1 polynomials.

(ii) Let $T$ be the theory of groups, $T'$ the theory of abelian groups. There
is an obvious ‘surjective’ $T \to T'$ obtained by identifying two abstract group
operations if their equality follows from the commutative law. Studying the
quotients of a theory $T$ is in fact equivalent to studying the subvarieties of
the variety it defines, to express things in Birkhoff terms (COHN [1965]).

(iii) Let $T$ be the theory of Lie-rings, $T'$ the theory of (associative but not
necessarily commutative) rings. There is a morphism $T \to T'$ which sends
the binary $[-,-]$ in $T$ into the commutator operation $\omega$ in $T'$,
$\omega(x, y) = x \cdot y - y \cdot x$.
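
That the commutator in 1.9(iii) really yields a Lie-ring structure can be spot-checked numerically. The following Python sketch does so in one particular associative ring, the $2 \times 2$ integer matrices. The choice of ring, the sample matrices and the helper names are assumptions made for illustration, and a check on samples is of course not a proof.

```python
def mat_mul(a, b):
    """Product of two square matrices given as tuples of row tuples."""
    n = len(a)
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )

def mat_add(a, b):
    return tuple(tuple(x + y for x, y in zip(ra, rb)) for ra, rb in zip(a, b))

def mat_sub(a, b):
    return tuple(tuple(x - y for x, y in zip(ra, rb)) for ra, rb in zip(a, b))

def bracket(x, y):
    """The commutator operation w(x, y) = x*y - y*x of 1.9(iii)."""
    return mat_sub(mat_mul(x, y), mat_mul(y, x))

def jacobi(x, y, z):
    """[x,[y,z]] + [y,[z,x]] + [z,[x,y]]; zero in every associative ring."""
    return mat_add(
        mat_add(bracket(x, bracket(y, z)), bracket(y, bracket(z, x))),
        bracket(z, bracket(x, y)),
    )

ZERO = ((0, 0), (0, 0))
X = ((1, 2), (3, 4))
Y = ((0, 1), (1, 0))
Z = ((2, 0), (5, -1))
```

Here `bracket(X, X)`, `mat_add(bracket(X, Y), bracket(Y, X))` and `jacobi(X, Y, Z)` all evaluate to `ZERO`, which are the defining identities of a Lie ring for the bracket.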


1.10. If $f\colon T \to T'$ is a morphism of algebraic theories, one immediately gets
a functor ‘composing with $f$’


$f^*\colon \mathrm{Alg}(T', \mathcal{E}) \to \mathrm{Alg}(T, \mathcal{E})$.


This feature is common to all kinds of functorial semantics. In the case of
algebraic theories (next section), one can say more, namely (LAWVERE
[1963]):


1.11. THEOREM. If the value category $\mathcal{E}$ is (Sets) (or any other locally
presentable category, GABRIEL and ULMER [1971]), the functor $f^*$ has a left
adjoint.


PROOF. For a proof in the (Sets) case we refer to SCHUBERT [1970]. □


1.12. Free algebras 

The existence of such can be considered as a corollary of Theorem 1.11,
by taking the $T$ in 1.10 to be the algebraic theory whose only $n$-ary
operations are the projections, and whose category of algebras is just
(Sets). The functor $f^*$ can be identified with $A \mapsto A(1)$, which is a functor
$\mathrm{Alg}(T') \to$ (Sets). It has a left adjoint $F\colon (\mathrm{Sets}) \to \mathrm{Alg}(T')$, by 1.11. An
algebra of form $F(M)$ ($M$ a set) is called the free algebra on the set $M$ of
generators. It now turns out that any algebraic theory can be reconstructed
from the free algebras:


1.13. THEOREM. Let $T$ be an algebraic theory. Then $T$ is equivalent to the
dual of the full subcategory of $\mathrm{Alg}(T)$ defined by the free algebras $F(n)$
($n = 0, 1, 2, \ldots$).

PROOF. The functor $\hom_T(n, -)\colon T \to$ (Sets) preserves products and is thus
a $T$-algebra. For any other $T$-algebra $A\colon T \to$ (Sets), we have (denoting by
Hom the hom-functor of $[T, (\mathrm{Sets})]$ and thus of $\mathrm{Alg}(T)$):


$\mathrm{Hom}(\hom_T(n, -), A) \cong A(n) \cong \prod_n A(1) \cong \hom(n, A(1))$,


the first isomorphism by Yoneda’s lemma. Thus $\hom_T(n, -)$ serves as $F(n)$.
The second conclusion is now immediate, again by Yoneda. □


1.14. EXAMPLE. The following is an example of an algebraic theory which
is neither obtained by a presentation, nor by “semantic” means as in 1.13,
but rather by a construction performed in the category of algebraic
theories. Let $T_0$ be the theory whose $n$-ary operations are formal power
series in $n$ variables and without constant term. They can be substituted
into each other without any convergence trouble, since the constant term is
missing. Formally, $T_0$ can be obtained as an inverse limit formation in the
category of algebraic theories, $\varprojlim T_{0,n}$, where $T_{0,n}$ is the algebraic theory
for commutative rings without unit and satisfying $\forall x\colon x^n = 0$. The algebras
for $T_0$ are not known, neither is a syntactic presentation.


1.15. Varying the value category 

We can consider $\mathrm{Alg}(T, \mathcal{E})$ for any category $\mathcal{E}$ with finite products, in
particular when $\mathcal{E} = (\mathrm{Alg}(T'))^{op}$. In this case one gets the notion of
“$T$-coalgebra in the category of $T'$-algebras”, see WRAITH [1969] and FREYD
[1966], which covers structures like Hopf algebras which cannot be
described in terms of classical universal algebra (not even in finitary
first-order logic).


1.16. The generic algebra 
It is clear that if $p\colon \mathcal{E} \to \mathcal{E}'$ is a product preserving functor, we get a
functor ‘transport along $p$’


$\mathrm{Alg}(T, \mathcal{E}) \to \mathrm{Alg}(T, \mathcal{E}')$


by composing with $p$. We give a trivial example of this, which, however, is a
case of an important principle which we shall meet in Sections 2, 4 and 5,
the principle of generic structure. For a given algebraic theory $T$, we may
consider $\mathrm{Alg}(T, T)$, which makes sense since $T$ has finite products. Among
the $T$-algebras in $T$, we have the identity functor $\mathrm{id}_T\colon T \to T$. It is a generic
$T$-algebra in the sense that any other $T$-algebra $A\colon T \to \mathcal{E}$ in any category $\mathcal{E}$
(with finite products) can be obtained by transport of the algebra $\mathrm{id}_T$ along
the product preserving functor $A$ (clearly, since $A \circ \mathrm{id}_T = A$). (The example
1.9(iii) can be seen in this light: we equip (the underlying object 1 of)
the generic associative $T'$-algebra (associative ring) with the structure of a
Lie ring.)

However, the category $T$ in which the generic $T$-algebra lives is rather
poor in categorical structure. In the next section, we shall encounter a
richer category $FP(T)^{op}$ in which there also lives a generic $T$-algebra (this time for
cartesian (= finite limit preserving) functors), and similarly for each of
the stronger doctrines we meet later. The richer contexts make certain
constructions on $T$-algebras possible ‘once and for all’, namely as constructions
on the generic $T$-algebra.


1.17. Limitations 

It is possible to characterize in categorical terms categories which come
about as $\mathrm{Alg}(T)$ for some algebraic theory $T$, see LAWVERE [1965a]. Some
that do not are the category Cat of small categories, the category of
algebraic theories, or the category of partially ordered sets. These can be
dealt with in terms of functorial semantics using a somewhat richer notion
of “theory”: a small cartesian category. This kind of functorial semantics is
what GABRIEL and ULMER [1971] is about. We shall motivate cartesian
theories by an example of a relative interpretation of one theory into
another which cannot be described in the framework of algebraic theories.


2. Cartesian theories 


2.1. EXAMPLE (Descartes, roughly). Consider the functor “circle”
$S^1$: (Commutative rings) $\to$ (Sets)
which to a ring $A$ associates the set $S^1(A)$,
$S^1(A) = \{(x, y) \in A^2 \mid x^2 + y^2 = 1\}$.
This set even carries an abelian group structure defined as follows:
$(x, y) \cdot (x', y') = (x \cdot x' - y \cdot y',\; x \cdot y' + y \cdot x')$
(Gauss). Denoting by $T$ and $T'$ the algebraic theories of commutative rings
and abelian groups, respectively, we thus have a functor


$S^1\colon \mathrm{Alg}(T) \to \mathrm{Alg}(T')$. (2.1)
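
As a small illustration of the circle construction, the following Python sketch computes $S^1(A)$ for the finite commutative ring $A = \mathbb{Z}/5$ and checks the abelian group axioms for the multiplication given above by exhaustive enumeration. The choice of ring and the names `circle` and `circle_mul` are assumptions of this sketch.

```python
def circle(n):
    """All pairs (x, y) over Z/n with x^2 + y^2 = 1."""
    return [(x, y) for x in range(n) for y in range(n)
            if (x * x + y * y) % n == 1 % n]

def circle_mul(p, q, n):
    """(x, y) * (x', y') = (x x' - y y', x y' + y x'), computed in Z/n."""
    (x, y), (x2, y2) = p, q
    return ((x * x2 - y * y2) % n, (x * y2 + y * x2) % n)

N = 5
points = circle(N)
unit = (1, 0)

# Exhaustive checks of the abelian group axioms on S^1(Z/5).
closed = all(circle_mul(p, q, N) in points for p in points for q in points)
has_unit = all(circle_mul(unit, p, N) == p for p in points)
commutative = all(circle_mul(p, q, N) == circle_mul(q, p, N)
                  for p in points for q in points)
associative = all(
    circle_mul(circle_mul(p, q, N), r, N) == circle_mul(p, circle_mul(q, r, N), N)
    for p in points for q in points for r in points
)
has_inverses = all(circle_mul(p, (p[0], (-p[1]) % N), N) == unit for p in points)
```

The inverse of $(x, y)$ is taken to be $(x, -y)$, which works because $x^2 + y^2 = 1$ on the circle. The same checks can be run for other moduli by changing `N`.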


If $\mathcal{E}$ is a cartesian category (= having finite inverse limits), it has


290 KOCK, REYES/CATEGORICAL LOGIC [cH. A.8, §2 


equalizers, and if $A\colon T \to \mathcal{E}$ is a commutative ring object, we can use the
equalizer of


$A(X_1^2 + X_2^2)$ and $A(1)$ (2.2)


(both the arguments of $A$ being viewed as morphisms $2 \to 1$ in $T$) to define
$S^1$. In this way, the context of (2.1) is extended so as to define a functor


$S^1\colon \mathrm{Alg}(T, \mathcal{E}) \to \mathrm{Alg}(T', \mathcal{E})$


for any cartesian category $\mathcal{E}$.

Unlike the functor (Associative algebras) $\to$ (Lie algebras) considered in
1.9(iii), the functor $S^1$ is not describable by performing the $S^1$-construction
once and for all on the ‘generic commutative ring’ in $T$, because $T$ does not
possess the requisite equalizer. This motivates looking for a notion of
cartesian theory where such constructions can be performed, and investigating
the functorial semantics of such:


2.2. DEFINITION. A cartesian theory $C$ is a small category with finite inverse
limits. A model $M$ for $C$ with values in a category $\mathcal{E}$ with finite inverse
limits is a functor $M\colon C \to \mathcal{E}$ which preserves these limits; a morphism of
models $M \to M'$ is any natural transformation from $M$ to $M'$.


2.3. REMARK. Thus we have defined a category of models with values in $\mathcal{E}$,
frequently denoted $\mathrm{Lex}(C, \mathcal{E})$ since its objects are the left exact (= finite
inverse limit preserving = cartesian) functors. It depends functorially on
both $C$ and $\mathcal{E}$: any left exact functor $\mathcal{E} \to \mathcal{E}'$ between the value categories
defines a ‘transport’ functor between the model categories, $\mathrm{Lex}(C, \mathcal{E})
\to \mathrm{Lex}(C, \mathcal{E}')$. Any left exact functor $C \to C'$ between cartesian theories
(‘relative interpretation’) induces a functor $\mathrm{Lex}(C', \mathcal{E}) \to \mathrm{Lex}(C, \mathcal{E})$ between
the model categories. As in the case of algebraic theories, such
functors have left adjoints provided $\mathcal{E}$ = (Sets) (or any other locally
presentable category). The reader may find a thorough treatment of
categorical properties of categories of this form in GABRIEL and ULMER
[1971] (in their notation, $\mathrm{Lex}(C, \mathcal{E}) = \mathrm{St}_\alpha(C, \mathcal{E})$ with $\alpha = \aleph_0$). Among the
theorems in their book is one which we would term a (strong form of)
conceptual completeness of cartesian theories:


2.4. THEOREM. Suppose $C$ and $C'$ are cartesian theories such that
$\mathrm{Lex}(C, (\mathrm{Sets})) \simeq \mathrm{Lex}(C', (\mathrm{Sets}))$. Then $C \simeq C'$.


PROOF. The proof follows from the fact that $C$ can be reconstructed from
$\mathrm{Lex}(C, (\mathrm{Sets}))$ as the dual of the subcategory of (abstractly) finitely presentable
objects (GABRIEL and ULMER [1971], §7, in particular 7.10). □


2.5. Comparison with algebraic theories 

The clue to the proof that the doctrine of cartesian theories comprises 
that of algebraic theories also lies in the notion of finitely presented object. 
In a category of form Alg(T), the finitely presented objects form precisely 
the smallest subcategory containing all objects of form F(n), and stable 
under finite colimits. Every finitely presented algebra A can be described 
as a coequalizer in Alg(T), 


$F(n') \rightrightarrows F(n) \to A$,


(and a finite colimit of finitely presented algebras is finitely presented). We 
denote this subcategory FP(T). If a is a functor from the full subcategory of 
Alg(T) consisting of the objects of form F(n) (n = 0,1,...) (and which we 
may identify with $T^{op}$, by 1.13) into a category with finite colimits, then it
extends, because of the above coequalizer, in at most one way to a finite 
colimit preserving functor on FP(T), and a necessary condition that it does, 
is, of course, that a preserves finite coproducts (since F(n)+ F(m) = 
F(n + m)). This necessary condition is also sufficient: we formulate this in 
the following theorem, where we, however, first change the variance. 


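2.6. THEOREM. Let $A\colon T \to \mathcal{E}$ be a finite product preserving functor from an
algebraic theory into a cartesian category (so $A$ is an algebra for $T$). Then the
following diagram can be filled in with a left exact functor $\bar{A}$ (which is
uniquely determined up to isomorphism):


$$
\begin{array}{ccc}
T & \longrightarrow & FP(T)^{op} \\
 & {\scriptstyle A}\searrow & \downarrow{\scriptstyle \bar{A}} \\
 & & \mathcal{E}
\end{array}
$$


(The top functor being $n \mapsto F(n)$ as studied also in 1.13.)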


PROOF. For the case $\mathcal{E}$ = (Sets), it follows because $\mathrm{Alg}(T)$ is locally finitely
presentable, thus can be reconstructed as the category of set-valued left
exact functors on the dual of the category of finitely presented objects in it
(GABRIEL and ULMER [1971], 7.9). It now easily follows also for $\mathcal{E}$ a functor
category $[C^{op}, (\mathrm{Sets})]$. Further, if the values of $A$ are representable
functors, the values of $\bar{A}$ will be finite inverse limits of representables, and
will themselves be representable, provided $C$ has finite inverse limits. This
proves the theorem for $\mathcal{E} = C$. □


2.7. REMARK. The way to read the theorem (provided the value category 
has finite inverse limits) is: Every category of algebras for an algebraic 
theory T can be considered the category of models for a cartesian theory 
$P = FP(T)^{op}$. The identification identifies A with $\bar{A}$ in the above diagram.


2.8. Scope of the doctrine 
As mentioned in 1.17, there are notions, like ‘partially ordered set’,
which can be expressed by functorial semantics of a cartesian theory but
not of an algebraic theory. (The appropriate cartesian theory C is in this
case the dual of the category of finite partially ordered sets.) How this
works in detail can be seen in GABRIEL and ULMER [1971], 8.2b, or in the
work of the Ehresmann school, where one may find many examples of
cartesian theories or the closely related ‘esquisses’; see BASTIANI and
EHRESMANN [1972].


2.9. Syntactic characterization of the doctrine 

A sufficient condition for a first order theory to have its category of 
models (where maps are those that preserve the primitive predicates and 
operations) of the form Lex(C, (Sets)) for a suitable cartesian theory C, is 
that the axioms of the theory are simple Horn sentences, that is, of form 


$$\forall x\, (\varphi(x) \Rightarrow \exists! y\, \psi(x, y))$$


(x and y are tuples of variables) with $\varphi$ and $\psi$ conjunctions of atomic
formulae; see KEANE [1975]. The theory may be allowed to be many-sorted;
this corresponds to the fact that a cartesian theory C, unlike an
algebraic theory T, does not have a preferred ground object. There is a
sense in which Keane’s condition also goes the other way, but it is not clear
to us.


2.10. EXAMPLE (continuation of 2.1). The functor (2.1) is not induced by a
morphism of algebraic theories. However, if we take the corresponding
cartesian theories, it is. Equivalently, out of the generic ring object in the
cartesian theory C of commutative rings, we can construct the circle object
(and its group structure), once and for all, that is, we have a generic circle. If
$A^1$ denotes the generic ring object in C, the generic $S^1$ is constructed as the
equalizer in C


$$S^1 \to A^2 \rightrightarrows A^1$$


where the two displayed maps from $A^2$ ($= A^1 \times A^1$) to $A^1$ are the ring
operations $X_1^2 + X_2^2$ and 1, applied on the ring object $A^1$. Now, recalling
that C = (finitely presented commutative rings)$^{op}$, and $A^1 = F(1)$,
$A^2 = F(2)$, we see that $S^1$ is the finitely presented commutative ring which is the
coequalizer


$$S^1 \leftarrow \mathbb{Z}[X_1, X_2] = F(2) \leftleftarrows \mathbb{Z}[X] = F(1)$$


(the two parallel maps again being given by $X \mapsto X_1^2 + X_2^2$ and $X \mapsto 1$,
respectively). So


$$S^1 = \mathbb{Z}[X_1, X_2]/(X_1^2 + X_2^2 - 1).$$


The group structure on $S^1$, when viewed in C, is a co-group structure (1.15)
when viewed in the dual, more familiar category, the category of finitely
presented commutative rings. As such, it is a map $S^1 \to S^1 \otimes S^1$ (which
cannot be described in first-order terms). This way of thinking of the
circle, and other ‘algebraic subsets’ or ‘affine schemes’ in affine space $A^n$, is
familiar in modern algebraic geometry, see e.g. DEMAZURE and GABRIEL
[1970].
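
As a concrete check of the idea that the circle is obtained "once and for all" from
the ring structure, the following minimal sketch evaluates $S^1$ at the finite rings
Z/n and tests that its points form a group. The group law used here (complex
multiplication of pairs) is an assumption made for illustration, since the law is
not written out in this example; the helper names are likewise hypothetical.

```python
from itertools import product


def circle_points(n):
    """Points of S^1 with values in Z/n: pairs (x, y) with x^2 + y^2 = 1 mod n."""
    return {(x, y) for x, y in product(range(n), repeat=2)
            if (x * x + y * y - 1) % n == 0}


def mul(p, q, n):
    """Assumed group law on S^1(Z/n): multiply x + iy by x' + iy'."""
    (x1, y1), (x2, y2) = p, q
    return ((x1 * x2 - y1 * y2) % n, (x1 * y2 + y1 * x2) % n)


def is_group(n):
    """Check closure, unit (1, 0) and inverses (x, -y) on S^1(Z/n)."""
    pts = circle_points(n)
    unit = (1 % n, 0)
    closed = all(mul(p, q, n) in pts for p in pts for q in pts)
    has_unit = unit in pts and all(mul(unit, p, n) == p for p in pts)
    has_inverses = all(mul(p, (p[0], (-p[1]) % n), n) == unit for p in pts)
    return closed and has_unit and has_inverses
```

The checks use only ring operations, which is why the same formulas make sense
for every commutative ring at once.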


3. Elementary doctrines: quantifiers as adjoints 


3.1. Once equational and cartesian logic have been set in a categorical
context, it is natural to search for a categorical description of the logic of
quantifiers. The basic observation, due to LAWVERE [1965b], [1969] and
[1970], is that existential and universal quantification can be seen as left
and right adjoints, respectively, of substitution. Lawvere used this
description as a basis for arriving at a functorial semantics of first-order
logic, first in the context of ‘elementary theories’, see LAWVERE [1965b], further
developed by VOLGER [1975], and later in the context of ‘hyperdoctrines’
and ‘elementary existential doctrines’, LAWVERE [1970], p. 6. We shall
discuss the latter.


3.2. DEFINITION. An elementary existential doctrine (eed) is given by a base
category T with finite products (whose objects and morphisms one should
think of as the types and terms, respectively), together with, for each object
(type) n, a category B(n), the “category of attributes” of that type.
(For applications in logic, the attribute category is usually, but not always,
just a partially ordered set, the order relation being thought of as
entailment.) The basic structure required is now that B(n) depends
contravariantly and functorially on n, that is, for $f : n \to m$ a morphism in T, there
is given a functor (= order preserving map) $B(f) : B(m) \to B(n)$, thought
of as substitution along f, and having a left adjoint $\exists_f$, “existential
quantification along f”.
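
To make the adjointness concrete, here is a small sketch in the set-valued case,
assuming B(n) is the powerset of a finite set ordered by inclusion and f is the
projection of a product onto its first factor. The function names are chosen for
illustration only; the code checks by brute force that direct image along f is left
adjoint to substitution (inverse image) along f.

```python
from itertools import chain, combinations


def subsets(xs):
    """All subsets of a finite collection, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(xs, k) for k in range(len(xs) + 1))]


def substitution(f, dom, t):
    """B(f): pull an attribute t of the codomain back along f."""
    return frozenset(x for x in dom if f(x) in t)


def exists_along(f, s):
    """Candidate left adjoint of B(f): the direct image of the attribute s."""
    return frozenset(f(x) for x in s)


def is_left_adjoint(f, dom, cod):
    """Check: exists_f(s) <= t  iff  s <= B(f)(t), for all attributes s, t."""
    return all((exists_along(f, s) <= t) == (s <= substitution(f, dom, t))
               for s in subsets(dom) for t in subsets(cod))


X, Y = [0, 1, 2], ["a", "b"]
pairs = [(x, y) for x in X for y in Y]


def proj(p):
    """Projection X x Y -> X, playing the role of the term f."""
    return p[0]
```

Along a projection, `exists_along` sends a relation on X x Y to the set of x for
which some y is related to x, which is the usual reading of the existential
quantifier.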


3.3. REMARK. In Lawvere’s original treatment of eed’s, B(n) is assumed to 
be a cartesian closed category, but there are as many variants of the notion 
as there are fragments of first-order language (roughly). 


3.4. EXAMPLE. Any category $\mathcal{C}$ with finite inverse limits and good direct
images defines an eed $P(\mathcal{C})$ with $T = \mathcal{C}$, B(n) = subobject semilattice of
$n \in \mathcal{C}$, and B(f) = inverse image formation (pull-back) along f. The B(n)’s
will have, and the B(f)’s preserve, more or less lattice theoretic properties,
depending on the exactness properties of $\mathcal{C}$, leading to the variants of the
doctrine notion. We shall discuss the applicability of eed’s for functorial
semantics by considering the boolean case, where we have an (abstract) eed
where each B(n) is a boolean algebra, and each B(f) a boolean
homomorphism, and by considering only (Sets) as value category; note that
P((Sets)) is a boolean eed. Thus:


3.5. DEFINITION. A (set-valued) model of the boolean eed (T, B) is a 
morphism of boolean eed’s 


$$(T, B) \to P((\mathrm{Sets})),$$


(where a morphism of boolean eed’s $(T, B) \to (T', B')$ is a product preserving
functor $M : T \to T'$ between the base categories, as well as, for each
$n \in T$, a boolean homomorphism $B(n) \to B'(M(n))$, satisfying obvious
compatibility conditions).


3.6. To relate this notion of model to the notion of model for an ordinary
first-order theory $\mathcal{T}$, one has to take (T, B) to be the ‘Lindenbaum
doctrine’ of $\mathcal{T}$, where T is the algebraic theory having no other operations
than the projections (as studied in 1.12) and with B(n) = $\mathcal{T}$-equivalence
classes of formulas whose free variables are among $x_1, \ldots, x_n$. Then B(f)
(for $f : n \to m$) is defined by means of the syntactic process of substitution.
Existential quantification can be defined in terms of the syntactic $\exists$.
Sketches of this kind of semantics for eed’s are given in LAWVERE [1970]
(for the higher order case); see also COSTE [1973].

The reader familiar with polyadic or cylindric algebras may recognize
features in the (boolean) eed-approach, and in fact, specific comparison
theorems have been proved, JOYAL [1971].


3.7. However, from the viewpoint of categories as theories, eed’s have the
defect that they are categories T equipped with something extra, the
B(n)’s, which is not defined in terms of the category structure, except for
the case of a subobject eed $P(\mathcal{C})$ (Example 3.4). For which eed’s (T, B), or
for which first-order theories $\mathcal{T}$, does there exist a sufficiently rich cartesian
category R such that (T, B) (or $\mathcal{T}$) has the same models as the subobject
eed P(R)? Equivalently, when can the $\mathcal{C}$-valued models of (T, B) (or $\mathcal{T}$) be
described as functors $R \to \mathcal{C}$ preserving a certain amount of the categorical
structure? This question was first resolved by Joyal-Reyes-Dionne
(DIONNE [1973] for some of the fragments); see Example 4.4. Dionne’s
method was to construct R from $\mathcal{T}$. Better proofs of these results have been
given by Benabou-Coste (COSTE [1974]) who construct actual ‘good’
categories R from eed’s satisfying the ‘Beck’ (or Chevalley) and ‘Frobenius’
laws (terminology of LAWVERE [1970]), and apply this construction to the
Lindenbaum doctrine of $\mathcal{T}$ (for the appropriate fragment of logic).
Another technique is due to Freyd via the notion of allegory, a categorical
axiomatization of the relation calculus.


4. Logical categories: coherent logic 


The search for categorical versions of first-order logic in the spirit of
equating theories with categories and models with functors led to the
notion of regular category with stable sups, or logical category, for short, of
Joyal-Reyes, REYES [1974]. (As pointed out in Section 3, Lawvere and
Volger had studied earlier the related notions of “elementary theory” and
“elementary doctrine”.) In the context of logical categories, existential
quantifiers appear as images (in the categorical sense), the connective $\vee$
(“or”) as a supremum (of subobjects), whereas $\wedge$ (“and”) and substitution
(of variables by terms) are cases of inverse limits.


4.1. DEFINITION. A logical category T is a cartesian category with

(a) images which are stable under pull-backs, 

(b) finite sups of subobjects of a given object which are stable under 
pull-backs. 


4.2. REMARK. (ad (a)) The image of $f : A \to B$ is the smallest subobject
$I \rightarrowtail B$ through which f factors. We write $f : A \twoheadrightarrow B$ to mean that the image
of f is B itself.

We say that $f : A \twoheadrightarrow B$ is stable, if for every pull-back diagram


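$$\begin{array}{ccc}
A & \xrightarrow{\;f\;} & B \\
\uparrow & & \uparrow \\
A' & \xrightarrow{\;f'\;} & B'
\end{array}$$


the image of f' is B'.
(ad (b)) We say that $\bigvee_{i \in I} A_i = A$ is stable if for every $B \to A$,


$$\bigvee_{i \in I} B \times_A A_i = B.$$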


4.3. DEFINITION. A logical morphism between logical categories $f : T \to T'$ is
a functor which preserves finite inverse limits, images and finite sups.
We let $\mathrm{Mod}_{T'}(T)$ be the full subcategory of the functor category [T, T']
consisting of logical morphisms. In the particular case T' = (Sets), the
morphisms are called models of T and we define $\mathrm{Mod}(T) = \mathrm{Mod}_{(\mathrm{Sets})}(T)$.


4.4. EXAMPLE. Joyal-Reyes-Dionne (DIONNE [1973]) proved that those
first-order theories whose categories of models (with primitive-predicate
preserving maps as morphisms) are of the form $\mathrm{Mod}_{(\mathrm{Sets})}(T)$ are precisely
the coherent theories $\mathcal{T}$, as defined by Joyal-Reyes, in REYES [1974]; a
first-order theory is coherent if it can be presented with axioms of the form


$$\forall x\, (\varphi(x) \Rightarrow \psi(x))$$


where $\varphi$ and $\psi$ are built from the atomic formulas by means of $\wedge$, $\vee$ and $\exists$,
or are $t$ or $\bot$ (‘true’ or ‘false’). The notion of coherent theory will be
crucial in Section 5. The hard part of the proof consists in the construction of a
logical category R starting from $\mathcal{T}$. Basically, there is one object for each
concept definable in $\mathcal{T}$, and a map for each provably functional relation.


4.5. EXAMPLE. The theory of commutative local rings is coherent, being
axiomatizable by equations, and the two axioms


(i) $\forall x\, (t \Rightarrow (\exists y : x \cdot y = 1 \;\vee\; \exists y : (x - 1) \cdot y = 1))$,

(ii) $(1 = 0 \Rightarrow \bot)$.

(The equations are of the required form again by using $t$.)
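
As an illustration of how the two axioms are read in (Sets), the following sketch
tests them on the finite rings Z/n by brute force. The function names are
hypothetical, and the choice of Z/n as test rings is an assumption made only to
keep the search finite.

```python
def units(n):
    """Invertible elements of Z/n (for n >= 1)."""
    return {x for x in range(n)
            if any((x * y - 1) % n == 0 for y in range(n))}


def is_local(n):
    """Check axioms (i) and (ii) of 4.5 for the ring Z/n."""
    u = units(n)
    # (i): every x is invertible, or x - 1 is invertible.
    axiom_i = all(x in u or (x - 1) % n in u for x in range(n))
    # (ii): 1 = 0 implies false, i.e. the ring is not the zero ring.
    axiom_ii = (1 % n) != 0
    return axiom_i and axiom_ii
```

For instance, Z/8 passes (every even x has x - 1 odd, hence invertible), while
Z/6 fails at x = 3, where neither 3 nor 2 is invertible.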

4.6. COMPLETENESS THEOREM (Deligne, GROTHENDIECK [1963-1972],
Exposé 6; Joyal). If T is a logical category and $f, g : X \rightrightarrows Y$ are two different
morphisms in T, then there is a model M of T such that $M(f) \neq M(g)$.


PROOF (outline). The key idea is to extend T to a new logical category T'
having the property that every $A \twoheadrightarrow 1$ has a section (i.e. T' satisfies a weak
axiom of choice). The key observation in Joyal’s proof is the following:
whenever $A \twoheadrightarrow 1$ in T is given, we can add “universally” a formal section by
“changing base”, i.e., by considering the category T/A whose objects are
morphisms $B \to A$ and whose morphisms are commutative triangles


$$\begin{array}{ccc}
B & \longrightarrow & B' \\
 & \searrow & \downarrow \\
 & & A
\end{array}$$


which is again a logical category. The pullback functor $T \to T/A$ (which
sends B into $\pi_2 : B \times A \to A$) is conservative and, furthermore, $A \twoheadrightarrow 1 \in T$
has the diagonal $\Delta_A : A \to A \times A$ as a section in T/A.

By identifying objects with formulas (as in the example) we recognize the
analogue of a key step in Henkin’s proof: if A is a formula such that
$T \vdash \exists x A$, then $T \cup A(a)$ is a conservative extension of T, where a is a new
constant.

The rest of the proof proceeds by taking an appropriate ultrafilter. □


4.7. REMARK. One important question that can be asked in the categorical
context is: to what extent does the category of models of T determine T?
One way of interpreting this question is to ask whether further logical
structure may be introduced in T without changing the category of models
of T or whether T is “conceptually complete”. We notice at once that
logical categories are not “conceptually complete” in general. Indeed, by
adding “formally” to T coproducts and quotients by equivalence relations
we obtain a logical category $\tilde{T}$ such that the logical inclusion $T \to \tilde{T}$ induces
(via composition) an equivalence $\mathrm{Mod}(\tilde{T}) \to \mathrm{Mod}(T)$ (see ANTONIUS [1975]).
(Clearly, e.g. X/R is uniquely interpreted in (Sets) as the quotient of the
interpretation of X by that of R.) We find here a defect in the usual
formalizations of many-sorted languages where the “implicitly definable”
operations of coproducts and quotients are not explicit. Letting a pretopos
be a logical category with “good” coproducts and quotients by equivalence
relations, we have:


4.8. CONCEPTUAL COMPLETENESS (MAKKAI and REYES [1976]). If a logical
functor $f : T \to T'$ between pretopoi induces (via composition) an equivalence
$\mathrm{Mod}(T') \to \mathrm{Mod}(T)$, then f is an equivalence.


PROOF. The proof proceeds by introducing syntactical many-sorted
presentations for T, T' whereby f becomes an interpretation of theories.
Compactness and the method of diagrams are then used. A purely
categorical proof is still lacking. □


4.9. REMARK. Infinitary logical categories have also been considered
and BARR [1974] has proved a completeness theorem for Boolean-valued
models. A Gentzen-type axiomatization for this infinitary coherent logic,
with applications to completeness theorems (including Barr’s) may be
found in MAKKAI and REYES [1976].


5. Grothendieck topoi: infinitary coherent logic 


Topoi (and pretopoi) appeared in algebraic geometry, where they are 
ubiquitous via a generalization of the notion of sheaf over a topological 
space: this is the notion of sheaf over a site, due to Grothendieck. 


5.1. DEFINITION. (i) A site is a cartesian category $\mathcal{C}$ together with a notion
of localization, i.e. for every $A \in |\mathcal{C}|$ we are given a non-empty class
Loc(A) of families of morphisms $(A_i \to A)_{i \in I}$ of $\mathcal{C}$, called the localizations
of A, which are stable under pullbacks in the sense that for every $B \to A$,
the family $(A_i \times_A B \to B)_{i \in I}$ is a localization of B.
(ii) A sheaf over $\mathcal{C}$ is a functor $F : \mathcal{C}^{op} \to$ (Sets) satisfying the following
conditions, for every $(f_i : A_i \to A)_{i \in I} \in \mathrm{Loc}(A)$:
(a) if $\xi, \eta \in F(A)$ are such that $\xi_i = F(f_i)(\xi) = F(f_i)(\eta) = \eta_i$ for all
$i \in I$, then $\xi = \eta$.
(b) if $(\xi_i)_{i \in I}$ is a family such that $\xi_i \in F(A_i)$ for all $i \in I$ and is
compatible, i.e. in the diagram


$$\xi_i \in F(A_i) \xrightarrow{\;F(\pi_i)\;} F(A_i \times_A A_j) \xleftarrow{\;F(\pi_j)\;} F(A_j) \ni \xi_j$$
obtained via F from
$$A_i \xleftarrow{\;\pi_i\;} A_i \times_A A_j \xrightarrow{\;\pi_j\;} A_j$$


we have $F(\pi_i)(\xi_i) = F(\pi_j)(\xi_j)$ for all $i, j \in I$, then there is $\xi \in F(A)$
such that $\xi_i = F(f_i)(\xi)$, for all $i \in I$.
We let $\mathrm{Sh}(\mathcal{C})$ be the full subcategory of the functor category $[\mathcal{C}^{op}, (\mathrm{Sets})]$
consisting of sheaves over $\mathcal{C}$.
(iii) A Grothendieck topos is a category equivalent to one of the form
$\mathrm{Sh}(\mathcal{C})$, for a small site $\mathcal{C}$.


5.2. REMARK. (i) These notions allow us to give a meaning to the intuitive
idea that some concepts, e.g. “real-valued continuous functions with open
domain”, are of local character. In this example, we identify this concept
with the functor $C_{\mathbb{R}} : \mathrm{Open}(X)^{op} \to$ (Sets) which to every open set
$U \in \mathrm{Open}(X)$ associates its “extension” $C_{\mathbb{R}}(U) = \{f : U \to \mathbb{R} \mid f \text{ is continuous}\}$.
The ordered category Open(X) may be considered as a site, by defining a
localization of U as an open cover of U.

The local character of the concept in question is just the statements (a)
and (b).
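
The two sheaf conditions can be tried out on a toy case. The sketch below
assumes a finite set of points with the discrete topology, so that every function
is continuous, and represents a section over U as a Python dict with key set U;
restriction plays the role of F(f_i), and intersection plays the role of the
pullback of two members of a cover. All names are illustrative.

```python
def restrict(section, U):
    """Restrict a section (a dict defined on a superset of U) to U."""
    return {x: section[x] for x in U}


def compatible(family):
    """family: list of (U_i, s_i). Check that s_i and s_j agree on U_i & U_j."""
    return all(restrict(s, U & V) == restrict(t, U & V)
               for U, s in family for V, t in family)


def glue(family):
    """Condition (b): amalgamate a compatible family, or return None."""
    if not compatible(family):
        return None
    glued = {}
    for _, s in family:
        glued.update(s)
    return glued


U1, U2 = frozenset({1, 2}), frozenset({2, 3})
s1 = {1: "a", 2: "b"}
s2 = {2: "b", 3: "c"}
glued = glue([(U1, s1), (U2, s2)])
```

Condition (a) holds here because a dict on the union of the cover is determined
by its restrictions to the members of the cover; condition (b) is what `glue`
computes.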

(ii) A site structure and the corresponding notion of sheaf may be 
defined on a category €, even if € does not have finite inverse limits. The 
definition is technically more involved. Examples of this type appear in 
model theory (see 5.3(ii)). 

(iii) One of the main theorems of topos theory is Giraud’s (GROTHENDIECK
[1963-1972], Exposé 4) which characterizes a topos as an $\infty$-pretopos
with a set of generators (where an $\infty$-pretopos is a category with finite
inverse limits, “good” infinite coproducts and quotients by equivalence
relations). In particular, any topos has a canonical site structure given by:
$(A_i \to A)_{i \in I}$ is a localization iff the unique map


$$\coprod_{i \in I} A_i \to A$$


is epimorphic. (“Goodness” is precisely the condition that this type of
families are localizations.) This is sometimes called the arbitrary cover
localization.


5.3. EXAMPLE. (i) Let B be a complete Boolean algebra, considered as a
category. We define the canonical localization as follows: $(a_i)_{i \in I}$ covers a
iff $a = \bigvee_{i \in I} a_i$.

Then Sh(B) is equivalent to the category of Boolean-valued sets and
Boolean-valued maps of Scott-Solovay’s universe V.

This example opens the way for a category-theoretic approach to
independence results in set theory: see HIGGS [1976], TIERNEY [1972],
BOILEAU [1975], and BUNGE [1974].

(ii) Let C be a small category. There is a so-called $\neg\neg$-localization on
it which can be described as follows: a family $\{C_i \to C \mid i \in I\}$ is a
$\neg\neg$-localization if for each $B \to C$ there is an $A \to B$ such that for some
$i \in I$ we have a commutative diagram


$$\begin{array}{ccc}
C_i & \longrightarrow & C \\
\uparrow & & \uparrow \\
A & \longrightarrow & B.
\end{array}$$


This $\neg\neg$-localization will be used below.
(iii) Let R be a logical category. We introduce a site structure on R, the 
finite cover localization, by defining for I finite: 


$$(A_i \xrightarrow{\;f_i\;} A)_{i \in I} \in \mathrm{Loc}(A) \quad\text{iff}\quad A = \bigvee_{i \in I} \mathrm{Image}(f_i).$$


The stability conditions on R guarantee that this is a localization. 


5.4. DEFINITION. (i) A functor $F : \mathcal{C} \to \mathcal{D}$ between two sites is a site
morphism if it preserves the finite inverse limits and takes localizations of $\mathcal{C}$
into localizations of $\mathcal{D}$. We let $\mathrm{Cont}(\mathcal{C}, \mathcal{D})$ be the full subcategory of the
functor category $[\mathcal{C}, \mathcal{D}]$ consisting of site morphisms.

(ii) A geometric morphism $p : \mathcal{E} \to \mathcal{E}'$ between topoi is a couple $(p^*, p_*)$,
where $p^* : \mathcal{E}' \to \mathcal{E}$ is a left exact functor having $p_*$ as a right adjoint (thus
$p^*$ is a site morphism for the canonical site structures on $\mathcal{E}$ and $\mathcal{E}'$).


REMARK. The significance of coherent logic lies in the fact that its logical
operations (i.e., $\exists$, $\vee$, $\wedge$, $t$, $\bot$) are preserved by the functor $p^*$ for a
geometric morphism p. See also 6.13.


5.5. Forcing: the Kripke—Joyal semantics 

From the examples it should be clear that topoi have something to do
with forcing and generic structures. A. Joyal pointed out that the various
notions of forcing (Cohen’s, Robinson’s, Kripke’s) are special cases of
forcing with a site of conditions. Even for a site Open(X) (Remark 5.2(i))
the usual definitions break down: in Kripke’s semantics, for instance, the
validity of a disjunction or existential formula at the “stage” or “time” $\alpha$
is decided “on the spot”, i.e. without reference to further “stages” (or
“times”). On the other hand, in topology the existence of cross-sections
(over the whole space) for sheaves is rare compared to local existence, i.e.
existence of sections over coverings. That is, further “stages” must be taken
into account.


5.6. DEFINITION. Let $\mathcal{C}$ be a site, F a sheaf and $a \in F(X)$. $X \Vdash \exists x\, \varphi[a]$ iff
there is a localization $(f_i : X_i \to X)_{i \in I}$ of X and a family $(b_i)_{i \in I}$ such that for
every $i \in I$, $b_i \in F(X_i)$ and $X_i \Vdash \varphi[b_i, F(f_i)(a)]$.


5.7. REMARK. (i) A similar clause should be used to define $X \Vdash \varphi_1 \vee \varphi_2$. On
the other hand, $\wedge$ is decided “on the spot” and $\Rightarrow$, $\forall$ are dealt with in the
usual Kripke fashion, e.g.


$$X \Vdash (\varphi \Rightarrow \psi)[a] \quad\text{iff}\quad \text{for every } f : Y \to X,\ Y \Vdash \varphi[F(f)(a)] \text{ implies } Y \Vdash \psi[F(f)(a)];$$


negation is $\Rightarrow \bot$, and $\bot$ is described by means of objects Y for which
$\emptyset \in \mathrm{Cov}(Y)$.

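(ii) One can check that this notion of forcing specializes to the usual
ones. As an example, assume e.g. $\mathfrak{A} \Vdash^* (\varphi \vee \psi)[a]$ in the sense of
ROBINSON [1971], i.e.,


$$\forall\, \mathfrak{A} \xrightarrow{\;f\;} \mathfrak{B} \in \mathrm{Mod}(T)\ \ \exists\, \mathfrak{B} \xrightarrow{\;g\;} \mathfrak{C} \in \mathrm{Mod}(T)$$


such that $\mathfrak{C} \Vdash^* \varphi[gfa]$ or $\mathfrak{C} \Vdash^* \psi[gfa]$. We may assume that $g = \Phi(f)$.
Then the family


$$(\mathfrak{A} \xrightarrow{\;\Phi(f) f\;} \mathfrak{C})_f$$


is a “co-localization” for $\neg\neg$. Furthermore, for every $\mathfrak{C}$, $\mathfrak{C} \Vdash^* \varphi[\Phi(f) fa]$
or $\mathfrak{C} \Vdash^* \psi[\Phi(f) fa]$. We now use induction. (This example is taken from
unpublished work by Joyal-Reyes.)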

(iii) In the particular case that the site is a topos $\mathcal{E}$ (with the canonical
localization) we have a way of interpreting languages (even infinitary ones).
In particular we may define the notion of a T-model in $\mathcal{E}$, for any theory T
(in an $L_{\infty\omega}$-language, say), and the category of T-models in $\mathcal{E}$ as well. In case
$\mathcal{E}$ = (Sets), it amounts to this: its objects are set-theoretical models and its
morphisms are “algebraic”, i.e. functions $f : \mathfrak{A} \to \mathfrak{B}$ which preserve the
basic relations and operation symbols in the sense that
$(a_1, \ldots, a_n) \in R^{\mathfrak{A}} \Rightarrow (f(a_1), \ldots, f(a_n)) \in R^{\mathfrak{B}}$
(with a corresponding clause for operation symbols).

For a semantics in terms of extensions, see 6.8. 


5.8. PROPOSITION (Existence of a classifying topos; REYES [1974], in
collaboration with Joyal). For every finitary coherent theory T, there is a topos
$\mathcal{E}[T]$ and a T-model M in $\mathcal{E}[T]$ which is generic in the sense that the
category of T-models of $\mathcal{D}$ is equivalent (via M) to the category of geometric
morphisms of $\mathcal{D}$ into $\mathcal{E}[T]$, for every topos $\mathcal{D}$.


PROOF (outline). Let R be the logical category associated to T (Example
4.4). Then $\mathcal{E}[T]$ is the category of sheaves for R with the finite cover
topology (Example 5.3(iii)). □


5.9. REMARK. (i) The earliest example of the existence of a classifying topos
was given by HAKIM [1972], who proved that the Zariski topos is the
classifying topos for the theory of local rings.

(ii) This theorem remains true for an arbitrary (i.e. infinitary) coherent
theory. Furthermore, any topos is the classifying topos of such a theory (see
MAKKAI and REYES [1976]).


6. Elementary topoi 


The word “‘elementary”’ in this context refers to the fact that the notion 
of elementary topos is an elementary notion: the notion of category with a 
few properties which are expressible in first order terms in the language of 
the theory of categories. It is even an “essentially algebraic” notion (FREYD 
[1972]). 

The search for such a concept, as a foundation of mathematics, was
initiated in LAWVERE [1964], but a more general notion comprising the
notion of Grothendieck topos, and at the same time revealing the logical
(even higher-order logical) nature of these, was found in 1969 by Lawvere
and Tierney; see LAWVERE [1972]. With the simplifications now possible,
KOCK and MIKKELSEN [1974], this very simple and powerful notion can be
described as follows.


6.1. DEFINITION. An elementary topos $\mathcal{E}$ is a category with finite inverse
limits and an object $\Omega$ which

(i) classifies subobjects, 

(ii) is exponentiable. 


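6.2. REMARK. Here, (i) means that there is a universal subobject true:
$1 \to \Omega$, such that any subobject $A' \rightarrowtail A$ of any object A comes about as a
pull-back


$$\begin{array}{ccc}
A & \xrightarrow{\;\alpha\;} & \Omega \\
\uparrow & & \uparrow{\scriptstyle \mathrm{true}} \\
A' & \longrightarrow & 1
\end{array}$$


for a unique $\alpha$ (“the characteristic function of A'”); and (ii) means that for
each A, there is an object $\Omega^A$ (“the power object of A”) and a map
$\in_A : A \times \Omega^A \to \Omega$, which establishes for each B a 1-1 correspondence


$$\frac{A \times B \to \Omega}{B \to \Omega^A} \qquad (6.1)$$


in the following way: to $\beta : B \to \Omega^A$, associate
$$A \times B \xrightarrow{\;A \times \beta\;} A \times \Omega^A \xrightarrow{\;\in_A\;} \Omega.$$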


6.3. EXAMPLE. The model to have in mind is the category of sets, with $\Omega$
any two-element set, say the set consisting of the two symbols “true” and
“false”; then $\Omega^A$ can be identified with the power set of A (since a subset
$A' \subseteq A$ can be completely described by its characteristic function
$\alpha : A \to \Omega$ given by $\alpha(a)$ = true if $a \in A'$, and $\alpha(a)$ = false if not).

What is important is first that each Grothendieck topos is also a model of
these axioms, secondly that essentially all (“intuitionistically valid”) higher
order naive set theory follows from the simple axioms of Definition 6.1.
From 1969 to now, a vast number of papers have demonstrated how various
notions and constructions of higher order naive set theory can be based on
and carried out in the setting of an elementary topos. Instead of trying to
survey all these, we give some detailed examples of constructions.


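6.4. EXAMPLE. It is clear that $\Omega^A$ depends contravariantly on A: $f : A \to B$
gives rise to $\Omega^f : \Omega^B \to \Omega^A$ (which in (Sets) is the law which to a subset B'
of B associates $f^{-1}(B') \subseteq A$). Also, $\Omega^A$ carries canonically the structure of
an ordered object (in the set case: the inclusion ordering on the set of
subsets of A). This structure is a subobject $\leq_A$,
$$\leq_A \;\rightarrowtail\; \Omega^A \times \Omega^A.$$
It has a characteristic function
$$\mathrm{ch}(\leq_A) : \Omega^A \times \Omega^A \to \Omega$$
which, according to the exponential adjointness (6.1) corresponds to a map
$$\downarrow\!\mathrm{seg} : \Omega^A \to \Omega^{\Omega^A}$$


which in the (Sets) case has the effect of associating to the subset $A' \subseteq A$
the family of all subsets $A'' \subseteq A$ with $A'' \subseteq A'$ (the “down-segment of
A'”). The right adjoint $\forall_f$ of $\Omega^f : \Omega^B \to \Omega^A$ (for $f : A \to B$) can now be
described as the composite


$$\Omega^A \xrightarrow{\;\downarrow\mathrm{seg}\;} \Omega^{\Omega^A} \xrightarrow{\;\Omega^{\Omega^f}\;} \Omega^{\Omega^B} \xrightarrow{\;\Omega^{\{\cdot\}}\;} \Omega^B$$


where $\{\cdot\} : B \to \Omega^B$ corresponds under exponential adjointness to the
characteristic map $B \times B \to \Omega$ of the diagonal $B \to B \times B$. In the (Sets)-
case, the reader will see that the composite has the following effect on
$A' \subseteq A$:

$$\begin{aligned}
A' &\mapsto \{A'' \subseteq A \mid A'' \subseteq A'\} \mapsto \{B' \subseteq B \mid f^{-1}(B') \subseteq A'\} \\
   &\mapsto \{b \in B \mid f^{-1}(\{b\}) \subseteq A'\} \\
   &= \{b \in B \mid \forall a \in A : f(a) = b \Rightarrow a \in A'\}
\end{aligned}$$


which we denote $\forall_f(A')$. Then it is easy set-theoretically to check that we
do have a right adjoint for the inverse image formation, i.e. we have


$$f^{-1}(B') \subseteq A' \quad\text{iff}\quad B' \subseteq \forall_f(A').$$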
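
The three-step description of the right adjoint can be followed literally in
(Sets). The sketch below, with illustrative names and a map f given as a dict,
computes the composite "down-segment, then pull back along inverse image, then
test singletons" and checks the adjointness by brute force on a small example.

```python
from itertools import chain, combinations


def powerset(xs):
    """All subsets of a finite collection, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(xs, k) for k in range(len(xs) + 1))]


def preimage(f, A, Bsub):
    """Inverse image of Bsub under f (f is a dict from A to B)."""
    return frozenset(a for a in A if f[a] in Bsub)


def down_seg(A, Asub):
    """Family of all subsets of A contained in Asub."""
    return {S for S in powerset(A) if S <= Asub}


def forall_f(f, A, B, Asub):
    """Universal quantification along f, computed as the three-step composite."""
    family_A = down_seg(A, Asub)
    family_B = {Bs for Bs in powerset(B) if preimage(f, A, Bs) in family_A}
    return frozenset(b for b in B if frozenset([b]) in family_B)


def adjunction_holds(f, A, B):
    """Check: preimage(B') <= A'  iff  B' <= forall_f(A'), for all A', B'."""
    return all((preimage(f, A, Bs) <= As) == (Bs <= forall_f(f, A, B, As))
               for As in powerset(A) for Bs in powerset(B))


A = [1, 2, 3, 4]
B = ["x", "y", "z"]
f = {1: "x", 2: "x", 3: "y", 4: "y"}
```

Note that an element of B with empty fibre (here "z") always belongs to the
result, in line with the universal quantifier being vacuously true there.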


Using this “internal universal quantification”, one can get an
“intersection”-construction


$$\bigcap : \Omega^{\Omega^A} \to \Omega^A$$
which in (Sets) has the effect of associating to $\mathcal{F} \subseteq \Omega^A$ the set $\bigcap \mathcal{F} \subseteq A$
(the intersection of all $A' \in \mathcal{F}$). We omit the details. Using this “intersection
combinator”, one can derive the left adjoint $\exists_f$ (“direct image formation
along f”) as the composite
$$\Omega^A \xrightarrow{\;\uparrow\mathrm{seg}\;} \Omega^{\Omega^A} \xrightarrow{\;\Omega^{\Omega^f}\;} \Omega^{\Omega^B} \xrightarrow{\;\bigcap\;} \Omega^B$$

($\uparrow$seg defined dually to $\downarrow$seg). In (Sets) this map takes $A' \subseteq A$ into the
intersection of all $B' \subseteq B$ with $f^{-1}(B') \supseteq A'$, which clearly is just f(A').
(The descriptions here are due to MIKKELSEN [1976]; it illustrates how one
can derive colimit-type notions, like image-formation $\exists_f$, on basis of the
axioms. For more about the basic theory, see KOCK and WRAITH [1971],
KOCK and MIKKELSEN [1974], MACLANE [1975] or WRAITH [1975].)
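
The derivation of the direct image from the intersection construction can also
be checked in (Sets). This is a minimal sketch under the same conventions as
before (finite sets, f given as a dict, hypothetical helper names): it forms the
up-segment of A', collects all B' whose inverse image lies in it, and intersects.

```python
from itertools import chain, combinations


def powerset(xs):
    """All subsets of a finite collection, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(xs, k) for k in range(len(xs) + 1))]


def preimage(f, A, Bsub):
    """Inverse image of Bsub under f (f is a dict from A to B)."""
    return frozenset(a for a in A if f[a] in Bsub)


def up_seg(A, Asub):
    """Family of all subsets of A containing Asub."""
    return {S for S in powerset(A) if S >= Asub}


def exists_f(f, A, B, Asub):
    """Direct image along f, computed as an intersection of subsets of B."""
    family_A = up_seg(A, Asub)
    family_B = [Bs for Bs in powerset(B) if preimage(f, A, Bs) in family_A]
    result = frozenset(B)   # B itself always belongs to family_B
    for Bs in family_B:
        result &= Bs
    return result


A = [1, 2, 3, 4]
B = ["x", "y", "z"]
f = {1: "x", 2: "x", 3: "y", 4: "y"}
```

The point of the exercise is that the image, usually thought of as a colimit-type
construction, is recovered here from inverse images and intersections alone.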

We now describe how some more complicated notions of higher naive
set theory get very simple formulations in the setting of elementary
topoi. Observe that a binary relation on an object A


$$R \rightarrowtail A \times A$$
gives rise to a characteristic function $A \times A \to \Omega$, and then by the
exponential adjointness to a map $r : A \to \Omega^A$ (“the down-segment for
R”), and conversely, whence we also call a map $r : A \to \Omega^A$ a binary
relation on A.


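6.5. DEFINITION (OSIUS [1974]). A relation $r : A \to \Omega^A$ is said to be
recursive (or (strongly) well founded) if for every B and $g : \Omega^B \to B$ there is a
unique $f : A \to B$ making the diagram


$$\begin{array}{ccc}
\Omega^A & \xrightarrow{\;\exists_f\;} & \Omega^B \\
{\scriptstyle r}\uparrow & & \downarrow{\scriptstyle g} \\
A & \xrightarrow{\;f\;} & B
\end{array} \qquad (6.2)$$


commute; and r is called (weakly) well founded if for each $g : \Omega^B \to B$,
there is at most one f making (6.2) commute.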
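
In (Sets), commutativity of (6.2) says f(a) = g({f(a') : a' in r(a)}), which is
definition by recursion over the relation. The following sketch, with
hypothetical names, computes such an f for a finite relation given as a dict of
down-segments and reports a failure when the relation is not well founded.

```python
def recursive_map(A, r, g):
    """Solve f = g . exists_f . r, i.e. f(a) = g({f(b) : b in r[a]})."""
    f = {}

    def value(a, trail=()):
        if a in trail:
            # a depends on itself: no recursion is possible along this cycle.
            raise ValueError("relation is not well founded")
        if a not in f:
            f[a] = g(frozenset(value(b, trail + (a,)) for b in r[a]))
        return f[a]

    for a in A:
        value(a)
    return f


def rank_step(s):
    """An example g: send a finite set of naturals to its strict upper bound."""
    return max(s) + 1 if s else 0


A = ["a", "b", "c", "d"]
r = {"a": set(), "b": {"a"}, "c": {"a", "b"}, "d": {"a"}}
rank = recursive_map(A, r, rank_step)
```

With this choice of g the resulting f is the rank function of the relation;
other choices of g give other recursively defined maps.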


6.6. THEOREM (Mikkelsen; see OSIUS [1975a]). The notions of strongly and
weakly well founded relation agree in any elementary topos.


PROOF. The quite difficult proof consists of course in constructing an f
making (6.2) commute. Theorem 6.6 thus is the assertion that the principle
of map-construction by transfinite recursion over (internally) well founded
relations is valid inside any elementary topos. □


6.7. REMARK. The fact that the recursion principle is valid in every topos
may be interpreted proof-theoretically: it is intuitionistically provable. In
fact, Fourman, COSTE [1974], BOILEAU [1975] (in collaboration with Joyal)
have all set up formal systems of higher order intuitionistic type theory
which are adequate and complete for elementary topoi. Thus elementary
topoi may be viewed as a categorical version of intuitionistic higher order
logic. Several results may be interpreted and proven in this context. This is
discussed in detail in Chapter D.6.

As far as first-order intuitionistic logic in a topos is concerned, formal
systems which again are adequate and complete for topoi have been set up
by Benabou, Boileau, Coste, Dionne, Joyal, Makkai, Oullet, Reyes,
Robitaille-Giguere (see COSTE [1973], OULLET [1974], for a Gentzen-type
system, and ROBITAILLE-GIGUERE [1975] for a Hilbert-type system).


6.8. Semantics by extensions 
The good exactness properties and the cartesian closedness which can be
proved for any elementary topos can be used to prove that any first-order
formula (coherent or not) whose primitive relations have been interpreted
by “extensions” has itself an “extension”. The notion of extension of
a formula can be defined as a universal element (in the sense of
Kripke-Joyal semantics) satisfying the formula. (We think of $\mathcal{E}$ as a site by
taking the finite cover topology.) A systematic description of this technique
of forming extensions of formulas and proving inclusion relationships
between such has been given by MITCHELL [1972], Benabou (COSTE [1973,
1974]) and OSIUS [1973]. The relationship to Kripke-Joyal semantics is
analysed in OSIUS [1975b].


6.9. EXAMPLE. Let R be a ring object in an elementary topos, and form
$GL(1, R) \subseteq R$ as the extension of the formula “x is invertible” (or formally
“$\exists y : x \cdot y = 1$”). Also, for each natural number n, one may form the
extension


$$[[\neg((x_1 = 0) \wedge \cdots \wedge (x_n = 0))]] \subseteq R^n \qquad (6.3)$$
as well as the extension
$$[[(x_1 \text{ is invertible}) \vee \cdots \vee (x_n \text{ is invertible})]] \subseteq R^n. \qquad (6.4)$$


6.10. DEFINITION. We say that R is a field object if for each n, the objects
described in (6.3) and (6.4) agree (i.e. if R satisfies: “when n elements are
not simultaneously zero, one of them is invertible, and conversely”). (This
is a non-coherent notion.)
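
In the topos (Sets) the two extensions are ordinary subsets of R^n and can be
compared directly. The sketch below does this for the rings Z/m and for tuple
lengths up to a small bound; the bound `max_len` and the function names are
assumptions made to keep the check finite, so a positive answer is only evidence
up to that bound.

```python
from itertools import product


def units(m):
    """Invertible elements of Z/m (for m >= 1)."""
    return {x for x in range(m)
            if any((x * y - 1) % m == 0 for y in range(m))}


def ext_not_all_zero(m, n):
    """Extension (6.3) in (Sets): n-tuples that are not simultaneously zero."""
    return {t for t in product(range(m), repeat=n) if any(x != 0 for x in t)}


def ext_some_invertible(m, n):
    """Extension (6.4) in (Sets): n-tuples with an invertible entry."""
    u = units(m)
    return {t for t in product(range(m), repeat=n) if any(x in u for x in t)}


def is_field_object(m, max_len=3):
    """Compare (6.3) and (6.4) for Z/m and n = 1, ..., max_len."""
    return all(ext_not_all_zero(m, n) == ext_some_invertible(m, n)
               for n in range(1, max_len + 1))
```

For Z/5 the two subsets coincide, whereas for Z/4 the 1-tuple (2) is non-zero but
not invertible, so the extensions differ already for n = 1.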


6.11. EXAMPLE (synthetic geometry, KOCK [1974]). Using the technique of
extensions, and other exactness properties of an elementary topos $\mathcal{E}$ with a
field object R, one can, following what one does in the (Sets) case, build
some of the interesting objects of algebraic geometry, like Grassmannians.
We shall illustrate this by constructing the projective $(n-1)$-space
$P^{n-1}(R)$. This is simply (6.3) (or (6.4)) modulo the equivalence relation
induced on this object by the obvious action of GL(1, R)


$$P^{n-1}(R) = [[\textstyle\bigvee_i x_i \text{ is invertible}]] \,/\, GL(1, R).$$


Also, one can carve out of the power object of $P^{n-1}(R)$ an object of
$(n-2)$-planes in $P^{n-1}(R)$, but it is more accessible to computation to
construct this object by linear and multilinear algebra. In the case $P^2(R)$,
for instance, one could define an object L (“of lines in $P^2(R)$”) purely
algebraically. Also using $3 \times 3$ determinants, one can next form the
extension of the notion “incidence”, i.e.,
$$[[\text{line } l \text{ passes through point } Q]] \subseteq L \times P^2(R).$$


One has now a model of a certain two-sorted first order structure with sorts
“lines” and “points”, and one relation, “incidence”, and one may ask
which sentences in this language hold for the structure thus constructed, in
particular, does it satisfy the axioms of the notion of projective plane? It
does, provided suitable ones among the non-equivalent intuitionistic forms of the
axioms for “projective plane” are chosen. For details, see KOCK [1974].
With suitable modifications, like saying “invertible” instead of “non-zero”
wherever possible, the whole development can be carried through for a
local ring object instead of a field object. The interest of this now is when it
is carried out for the generic local ring object A in the classifying topos $\mathcal{Z}$
for local rings (the Zariski topos).

It turns out (KOCK [1974]) that this generic local ring is in fact a field
object in the sense explained above, so that the construction of $P^2(A)$ and
the other Grassmannians can be carried out using not only the technique of
extensions, but also the assumption that A is a field. This $P^2(A)$, which
as an object of $\mathcal{Z}$ is a functor from finitely presented commutative rings
to sets, agrees with the explicitly described such functor which in
DEMAZURE and GABRIEL [1970] is honoured with this name. The theory of
extensions allows us to get the geometric objects “by logical means”
without having to produce functors and test sheaf-conditions; and it allows
questions of synthetic geometry to be raised.


6.12. REMARK. Using the object D ⊆ A, the extension [[x^2 = 0]], as “the 
infinitesimal object”, as well as the fact that 𝒵 is an elementary topos and 
thus has exponentiation, there seems to be a possibility of having even a 
synthetic differential geometry in 𝒵 and related topoi. 


6.13. Arithmetic universe 
A complete axiomatization of this notion (due to Joyal) seems not to 
exist, but the programme for what one wants to be axiomatized is the 
following: all structure of essentially algebraic nature (FREYD [1972]) which 
an elementary topos with natural number object has, and which is 
preserved by inverse image functors f*. So an arithmetic universe is at least 
a pretopos with natural number object N (since f* is a pretopos morphism 
which preserves N), but since an elementary topos with natural number 
object has, and f* preserves, formation of free monoids, free categories on 
a graph of generators, formation of ⊗-product of abelian group objects, 
..., all this may be included in a good axiomatics for arithmetic 
universe in case it does not already follow from the existence of a natural 
number object in a pretopos (which most of it does not). Joyal suggested 
as an axiomatization: a pretopos where free categories on graphs exist. A 
morphism of arithmetic universes is a functor which preserves the pretopos 
structure and the added structure (free categories on graphs, say). If one 
includes formation of ⊗ of abelian groups in the structure, one can 
describe the notion of Hopf-algebra object (ring object A with a map 
A → A ⊗ A satisfying certain equations) by functorial semantics in 
arithmetic universes. A classifying topos for the notion of Hopf algebra 
can be proved to exist. 


7. Topoi and axiomatic set theory 


Any model m of set theory (ZF, say) defines an elementary topos ℰ(m), 
the category of m-sets, whose objects are the elements (“sets”) of m, and 
topoi arising this way are rather special (having the multitude of, say, 
Grothendieck topoi in mind). The question arises whether one can 
characterize topoi of the form ℰ(m) in elementary terms (a non-elementary 
characterization was given in LAWVERE [1964]). 

This can also be viewed as the question of relative interpretation of a 
suitable extension of the first order theory of elementary topos into ZF, 
and vice versa. 

The question of characterization (or mutual relative interpretation) was 
solved in 1971 by COLE [1971], MITCHELL [1972], and OSIUS [1974]. It is a 
matter of building models of set theory out of a given elementary topos 
having suitable properties. Which properties are suitable? 

First, the following propositions are evident. 


7.1. PROPOSITION. Any topos of the form ℰ(m) has the property that the 
terminal object 1 is a generator; and it is non-trivial: 0 is not isomorphic to 1. 


That 1 is a generator means: if f, g are two different maps X → Y in 
ℰ(m), then there is a map x: 1 → X such that f∘x ≠ g∘x. This is clear since 
a map x: 1 → X corresponds to an element x ∈ X, viewing X as a “set” 
in m. 

Clearly, the more properties of an elementary topos ℰ one assumes, the 
stronger the model of set theory one can build from it. The main idea of the 
construction is to start with an elementary topos with very few extra 
properties and to obtain a model of a weak set theory. We give Osius’ 
form (OSIUS [1974]), which in view of 7.1 is “the minimal” one. The set 
theory ZO resulting here lies between Zermelo-Thiele and Zermelo-Fraenkel 
(ZF), see OSIUS [1974]. 


7.2. THEOREM. Given a non-trivial elementary topos ℰ in which 1 is a 
generator, one can build from it a model of the set theory ZO. 


Before we give (a sketch of) the proof, we motivate and give the relevant 
definition. The main thing is a categorical characterization of transitive 
sets. If A ∈ m, then A is said to be transitive if a ∈ A ⇒ a ⊆ A. Denoting 
the (restriction of the) ∈-relation on A by R, and reinterpreting R as a 
map r: A → Ω^A in ℰ(m), we have that r is well-founded (in the sense 
explained in Section 6) and is a monic map, because ∈ is well-founded and 
extensional in Zermelo-Thiele set theory. 

We can even get the converse, and more (for m a ZF model). 


7.3. THEOREM (Mostowski; see OSIUS [1974]). In order that a relation 
r: A → Ω^A on A ∈ ℰ(m) is isomorphic to the ∈-relation on a transitive set, 
it is necessary and sufficient that r is well-founded (Definition 6.5) and 
monic as a map. 


7.4. THEOREM (OSIUS [1974]). In order that a map f: A_1 → A_2 between two 
transitive sets in m is the inclusion of A_1 ⊆ A_2, it is necessary and sufficient 
that 


[diagram (7.1): a square relating f, r_1 and r_2; not recoverable from this copy]    (7.1) 


commutes, where r_1 and r_2 are the ∈-relations on A_1 and A_2. 


So we have a structural characterization of transitive sets and of those 
maps between them which are inclusions. 


7.5. PROOF OF THEOREM 7.2 (outline). We start by taking the conclusions of 
Theorems 7.3 and 7.4 as definitions. Let ℰ be a non-trivial elementary 
topos where 1 is a generator. 


7.6. DEFINITION. An object A ∈ ℰ equipped with a relation r: A → Ω^A is 
called a transitive set object (tr.s.o.) if r is monic and recursive (Definition 
6.5). A map f: A_1 → A_2 between two tr.s.o.’s (A_1, r_1) and (A_2, r_2) is called 
an inclusion if (7.1) commutes. 

One now goes about proving that the category of tr.s.o.’s in ℰ, with 
inclusions as mappings, is equivalent (as a category) to a lattice (except that 
a maximal element is missing). 

Now a model m(ℰ) for ZO can be constructed as follows: the elements 
of m(ℰ) are equivalence classes of pairs ((A, r), M), where (A, r) is a tr.s.o. 
and M: A → Ω is a map (heuristically, M is thus (the characteristic map of) a 
subset M′ of a transitive set A). The equivalence relation is given by means 
of the lattice structure. 

To describe the membership relation, one utilizes the family of maps 


{Ω^A × A → Ω | A ∈ |ℰ|} 


given by the topos structure of ℰ. 

A more detailed comparison between mutual strengths of certain 
(stronger) systems of set theory, and some further strengthenings of the 
notion of elementary topos, is given in the works OSIUS [1974], COLE 
[1971] and MITCHELL [1972]. 


8. Other fields of research in categorical logic 


We have not tried to be comprehensive. We would have liked also to 
include comments on: 
8.1. Model completeness via sheaf categories (MACINTYRE [1973]). 
8.2. Logical methods applied to sheaf categories on topological spaces and 
real number objects in elementary topoi (MULVEY [1974], SCOTT [1968], 
Stout, Tierney). 
8.3. Topos theoretic interpretation of methods of non-standard analysis 
(KOCK and MIKKELSEN [1974], REYES [1974], TAKAHASHI [1975]). 
8.4. Topos theoretic methods in independence proofs in set theory (TIERNEY 
[1972], BUNGE [1974]), although this goes against the trend of letting 
the “topos” notion have the lead over the notion “model of set theory”. 
8.5. Elementary topoi with a natural number object, and how an internal 
theory of classifying topos can be formed (Diaconescu, Tierney, Joyal, 
Johnstone, Lesaffre, De Kinder, WRAITH [1975], where references can be 
found). 
8.6. Combinatorial logic and proof theory in form of cartesian closed 
categories (LAMBEK [1972], SZABO [1977] and SCOTT [1972]). 


The advances in set theory which followed the initial discoveries of 
Gödel and Cohen have sent shockwaves of independence results throughout 
several parts of mathematics. The main thrust of this part of the 
Handbook is toward explaining to the mathematician many of the important 
methods and results which have emerged. 

The introductory chapter is by J. Shoenfield. In it he describes the 
intuitive universe of all sets and the axiomatic set theory ZF to which it 
gives rise. This chapter is strongly recommended for anyone who doesn’t 
know what ZF is, or who does know but has the impression that the axioms 
arise in an ad hoc attempt to eliminate the paradoxes. 

Jech’s chapter, which follows, explains the special status of AC, the 
axiom of choice. It discusses the uses of the axiom of choice in mathematics 
and how mathematics would look without AC. It also contains an 
introduction to the consistency and independence results surrounding AC. 

A first-order sentence φ is consistent with ZF if one cannot prove ¬φ, 
the negation of φ, from the axioms of ZF. To put it another way, φ is 
consistent with ZF if the theory ZF + φ does not give rise to any 
contradictions. The sentence φ is independent of ZF if φ is not a theorem 
of ZF, that is, if ¬φ is consistent with ZF. There are two ways to establish 
the consistency of a mathematical statement with set theory: the easy 
way and the hard way. 

The easy way to prove some sentence φ consistent with ZF is to find 
some other sentence ψ which is already known to be consistent with ZF 
and then prove (in ZF) that ψ implies φ. There are several good candidates 
for ψ which have emerged. The mathematician who is familiar with these 
need know no logic whatsoever in order to prove consistency results. We 
mention, in particular, Martin’s axiom MA, the continuum hypothesis CH 
and the stronger principle ◇ of Jensen. Martin’s axiom and its uses are 
discussed in M.E. Rudin’s chapter. Uses of CH and ◇ are discussed in the 
chapters of Kunen and Juhász. 

The hard way to prove a consistency result is to go back to basics and 
build a model. Thus, if we want to show that φ is consistent with ZF we 
construct a model 𝔐 of the axioms of ZF (and hence of the theorems of 
ZF) in which φ is true. 

The first such construction of a model of ZF was Gödel’s universe L of 
constructible sets. This is a very natural model: it is the smallest model of 
ZF containing all the ordinals. Gödel showed that both AC and the 
continuum hypothesis CH are true in the model L, and hence are 
consistent with set theory. Jensen has studied L in depth and has 
discovered many beautiful properties of this model, like the principle ◇ 
mentioned above. This work of Gödel and Jensen is surveyed in Devlin’s 
chapter. 

After L, the best way to construct models of ZF is Cohen’s method of 
forcing. Cohen invented the method to prove the independence of AC and 
CH. Solovay and others have refined and simplified forcing so that by now 
it is a very powerful and not too complicated method for obtaining 
consistency results. Burgess, in his chapter, explains forcing in a way that 
should allow the non-logician to construct his own independence proofs. 
The early parts of this chapter, on absoluteness, are needed in Devlin’s 
chapter. 

Kunen’s chapter is of a rather different nature in that it requires no 
knowledge of mathematical logic. It discusses infinitary combinatorial 
principles of set theory, some of which have come from logic, which have 
far-reaching applications. In particular, he shows how to view AC, CH and 
◇ as progressively stronger “enumeration” principles, and when each 
should be used. He also discusses Ramsey’s theorem and uncountable 
generalizations of it which have proven important, ending with a discussion 
of large cardinal properties of a combinatorial nature. Other large cardinal 
properties are discussed, with suggested reading, in an appendix. 

Among the chapters in other parts of the Handbook which are relevant 
to set theory are Martin’s chapter on descriptive set theory (in Part C) and 
the last section of Morley’s chapter (in Part A) on homogeneous sets, where 
applications of large cardinals to L are discussed. Section 4 of Simpson’s 
chapter (in Part C) discusses some uses of extra set-theoretic hypotheses in 
the study of degrees of unsolvability. 


1. Introduction 


Ideally, an axiom system is formed as follows. First, we select the basic 
concepts and explain their nature as fully as possible. Then we write down 
axioms for the concepts. If all goes well, our explanation will make it clear 
that the axioms are true. 

We will try to present the axioms of set theory in this fashion. We will 
therefore begin with an explanation of the notion of a set. Our explanation 
may appear surprisingly complicated to the mathematician who feels that 
he understands sets perfectly well. However, we shall see that this 
explanation is quite useful, not only for justifying the axioms of set theory, 
but also for investigating new axioms and for proving theorems about sets. 

The ideas presented here have been developed gradually during the past 
century. Although they are well known to most set-theorists, they have 
rarely appeared in print in a coherent form. This explains the lack of a 
bibliography. 


2. Sets and set formation 


How shall we explain the notion of a set? As a first approximation, a set 
is a collection of objects. Thus a set is formed by selecting certain objects, 
called the members of the set; and the set is completely determined by its 
members. 

The objects which are members of sets may be of any kind. In particular, 
we want to consider a set as an object and thus to allow it to be a member 
of another set. All objects other than sets which are used as members of 
sets are called urelements. 

Even without using urelements, we can form many sets. We can form the 
empty set ∅; the set {∅} whose only member is ∅; the set whose only 
members are ∅ and {∅}; and so on. We shall confine our attention to such 
sets, and shall use x, y, and z to represent such sets only. We shall explain 
later why nothing is lost by this. 

We have now reached the following point: a set x is formed by choosing 
the sets which are to be members of x. Are there any restrictions on the sets 
which we may pick? There are, as the paradoxes of set theory show. 

Let us recall the Russell paradox. Let r be the set whose members are all 
sets x such that x is not a member of x. Then for every set x, 


x ∈ r ↔ x ∉ x.    (1) 


Substituting r for x, we get a contradiction. 

The explanation is not really difficult. When we are forming a set z by 
choosing its members, we do not yet have the object z, and hence cannot 
use it as a member of z. The same reasoning shows that certain other sets 
cannot be members of z. For example, suppose that z ∈ y. Then we cannot 
form y until we have formed z. Hence y is not available as an object when 
z is formed, and therefore cannot be a member of z. 

Putting the matter in a positive way, a set z can have as members only 
those sets which are formed before¹ z. Thus for the set r formed above, (1) 
holds only for sets x formed before r; so we cannot substitute r for x. 

Carrying the analysis a bit further, we arrive at the following. Sets are 
formed in stages. For each stage S, there are certain stages which are before 
S. At stage S, each collection consisting of sets formed at stages before S is 
formed into a set.² There are no sets other than the sets which are formed 
at the stages. 

This gives a reasonably clear explanation of the notion of a set in terms 
of the notions of a stage and of before. What can we say about these 
notions? We should certainly expect before to be a partial ordering of the 
stages; and this is the only fact about this relation which we need for our 
axioms. 
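
The formation of sets in stages can be imitated on a very small scale. The following Python sketch is only an illustration under assumptions which are not part of the text: the stages are taken to be the finite stages 0, 1, 2, 3 in their usual order, sets are modelled by `frozenset`, and the helper names `subsets` and `stages` are hypothetical.

```python
from itertools import combinations

def subsets(collection):
    """All subcollections of a finite collection, each turned into a frozenset."""
    items = list(collection)
    return {frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)}

def stages(n):
    """Return [formed_0, ..., formed_n], where formed_k holds the sets formed at stage k."""
    earlier = set()              # sets formed at stages before the current one
    result = []
    for _ in range(n + 1):
        formed = subsets(earlier)    # every collection of earlier sets is formed into a set
        result.append(formed)
        earlier = earlier | formed
    return result

V = stages(3)
print([len(level) for level in V])        # [1, 2, 4, 16]
print(all(x not in x for x in V[3]))      # True: no set is a member of itself
```

In this toy model every set formed at one stage is formed again at each later stage, and no set is a member of itself, so the collection r of the Russell paradox is never formed.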

Stages are important to us because they enable us to form sets. Thus 
suppose that x is a collection of sets and that S is a collection of stages such 
that each member of x is formed at a stage which is a member of S. If there 
is a stage after all of the members of S, then we can form x at this stage. 
Thus the fundamental question for us is: given a collection S of stages, 
is there a stage after all of the members of S? 

We would like the answer to this question to be yes whenever possible. 
We know by the paradoxes that not every collection of sets is a set; but we 
have avoided the paradoxes by restricting ourselves to sets which are 
formed at some stage. We do not wish to further restrict the notion of a set 
by not having sufficiently many stages. 

Nevertheless, the answer to our question cannot always be yes. For 
example, if S is the collection of all stages, then there is no stage after every 
stage in S. 

A possible answer to our fundamental question is this: there is a stage 
after all the stages in S provided that we can imagine a situation in which 
all of the stages in S have been completed. In the case in which S contains 
all stages, we cannot imagine such a situation, since we can always imagine 
a further stage. At best, this is a very vague answer; for it is not at all clear 
in general what we can or cannot imagine. It does, however, provide a 
useful guide for obtaining more precise principles on which we can base 
axioms. 


¹ We should interpret before here in a logical rather than a temporal sense. It is similar to 
what we mean when we say that one theorem must be proved before another. 


² Note that this means that if a set is formed at stage S, it can also be formed at every later 
stage. We could arrange things so that each set is formed at exactly one stage; but there is no 
point in doing so. 

Specifically, there are three cases in which our vague answer leads us to 
conclude that there is a stage after each member of S. The first is the case in 
which S consists of a single stage. The second is that in which S consists of 
an infinite sequence S_0, S_1, ... of stages. The third case is that in which we 
have a set x and a stage S_y for each y in x, and S consists of the stages S_y 
for y in x. 


In the first two cases, it is clear that we can imagine a situation in which 
all of the stages in S have been completed. In the third case, we can argue 
as follows. Suppose that as each stage S is completed, we take each y in x 
which is formed at S and complete the stage S_y. When we reach the stage at 
which x is formed, we will have formed each y in x and hence completed 
each stage S_y in S. 

We have now progressed sufficiently far in our analysis to obtain the 
usual axioms of set theory. There are still many obscure points at which 
further analysis might lead to a better understanding of the axioms or to 
new axioms. We shall call attention to some of these as we proceed. 

It is, of course, possible that there is a completely different analysis of the 
notion of a set, and this might lead to quite a different set of axioms. Up to 
the present, however, there has been no analysis of the notion of a set 
essentially different from that given here which leads to a satisfactory 
system of axioms. 


3. The axioms 


Before turning to the axioms, we must describe the language of set 
theory. This language has set variables x, y, z, ... which represent arbitrary 
sets. It also has the symbol ∈ for the membership relation. 

The rest of the notation is logical. We have the symbol = for is identical 
with. We have the propositional connectives: ¬ for not, ∨ for or, ∧ for and, 
→ for implies, and ↔ for iff. We have the quantifiers: ∀ for for all, ∃ for for 
some, and ∃! for for exactly one. The variables following a quantifier may 
be restricted to belong to some set; e.g., ∀x ∈ y means for all x in y. 

Of course some of this logical notation could be defined in terms of the 
rest; but we are interested in set theory, not logic. For this reason, we shall 
not bother to state the axioms of first-order logic, nor even the precise 
definition of a formula of first-order logic. These are discussed at length in 
the introductory chapter of Barwise (Chapter A.1). 

We will recall from logic how operations are introduced. A unary 
operation F is introduced by defining F(x) to be the unique y such that 
φ(x, y) (where φ(x, y) is some formula of the language not containing F). 
More precisely, we first prove ∀x ∃!y φ(x, y), and then introduce F by the 
axiom φ(x, F(x)). A formula ψ(F(x)) containing F can then be interpreted 
as an abbreviation of ∃y (φ(x, y) ∧ ψ(y)). Binary and n-ary operations are 
treated similarly. 

Remark: We allow an operation to depend upon parameters. Thus we 
might set F(x) = x ∪ y, so that F depends upon y. 

Now we can turn to the axioms. The first point in our analysis was that a 
set is entirely determined by its members. This is the content of our first 
axiom. 


EXTENSIONALITY AXIOM: ∀z (z ∈ x ↔ z ∈ y) → x = y. 


One of the most important points established by our analysis is that 
certain collections of sets are sets. In translating this into our language, we 
face a difficulty: there is no general method of talking about collections in 
this language before we know that they are sets. There are, however, 
certain collections which we can talk about. Given a formula φ(x), we can 
say certain things about the collection of all sets x such that φ(x). In 
particular, we can say that it is a set as follows: ∃y ∀x (x ∈ y ↔ φ(x)). We 
abbreviate this expression to Set{x: φ(x)}. 

Our first principle of set existence is: if every member of a collection of 
sets belongs to the set x, then that collection is a set. To see this, suppose 
that x is formed at stage S. Then every member of x is formed before S, 
and hence so is every member of the collection. Hence the collection can 
be formed into a set at stage S. We express this principle in our next axiom. 


SEPARATION AXIOM: ∀x (φ(x) → x ∈ y) → Set{x: φ(x)}. 


Note that the Separation Axiom is not a single axiom, but an infinite set 
of axioms, one for each formula φ(x). (The name comes from the fact that 
we are separating those x which satisfy φ(x) from those which do not.) 

Our next principle is: the union of all the members of a set x is a set. For 
suppose that x is formed at S. Then every member of x is formed before S, 
and hence every member of a member of x is formed before S. This means 
that every member of the union is formed before S; so the union can be 
formed at S. Our next axiom states this principle. 


UNION AXIOM: Set{z: ∃y ∈ x (z ∈ y)}. 


The next principle is: if x is any set, the collection of all subsets of x is a 
set. For suppose x is formed at S. Since every member of x is formed 
before S, every subset of x is formed at S. Thus the set of all subsets of x 
can be formed at any stage after S. 

To state this principle as an axiom, we define: 


x ⊆ y ↔ ∀z (z ∈ x → z ∈ y). 


POWER SET AXIOM: Set{y: y ⊆ x}. 


Our next principle is: if F is a unary operation and x is a set, then the 
collection of all F(y) for y ∈ x is a set. To see this, let S_y be a stage at 
which F(y) is formed. Then there is a stage S after all the stages S_y for 
y ∈ x. At stage S, we can form the desired set. 


REPLACEMENT AXIOM: Set{z: ∃y ∈ x (z = F(y))}. 
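
For finite sets built up from the empty set, the four principles of set existence stated so far can be tried out directly. The sketch below is an illustration only; it assumes that sets are modelled by Python's `frozenset`, and the function names `separation`, `union`, `power_set` and `replacement` are hypothetical labels, not notation from the text.

```python
from itertools import combinations

EMPTY = frozenset()

def separation(y, phi):
    """The members of the set y which satisfy phi."""
    return frozenset(x for x in y if phi(x))

def union(x):
    """The members of the members of x."""
    return frozenset(z for y in x for z in y)

def power_set(x):
    """The set of all subsets of x."""
    items = list(x)
    return frozenset(frozenset(c) for k in range(len(items) + 1)
                     for c in combinations(items, k))

def replacement(x, F):
    """The set of all F(y) for y in x, where F is a unary operation."""
    return frozenset(F(y) for y in x)

one = frozenset({EMPTY})          # {∅}
two = frozenset({EMPTY, one})     # {∅, {∅}}

print(power_set(two) == frozenset({EMPTY, one, frozenset({one}), two}))   # True
print(union(frozenset({one, two})) == two)                                # True
print(separation(two, lambda s: len(s) == 0) == one)                      # True
print(replacement(two, lambda s: frozenset({s}))
      == frozenset({one, frozenset({one})}))                              # True
```

Of course the axioms assert much more than this: they apply to arbitrary sets and arbitrary formulas, whereas the sketch only handles finite sets and computable conditions.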


Our next axiom guarantees the existence of an infinite set. It is a bit 
complicated because we have no direct way in our language to say that a set 
is infinite. 


INFINITY AXIOM: ∃x (∃y ∈ x ∀z (z ∉ y) 
∧ ∀y ∈ x ∃z ∈ x ∀w (w ∈ z ↔ w ∈ y ∨ w = y)). 


Let us see why the Infinity Axiom is true. Let x_0 be the empty set, and for 
each n, let x_{n+1} be the set whose members are the members of x_n and x_n 
itself. We can form x_0 at any stage; and if x_n is formed at some stage, then 
x_{n+1} can be formed at any later stage. Suppose that x_n is formed at S_n. Then 
there is a stage S after all of the S_n. At this stage, we can form the set x 
whose members are x_0, x_1, .... This x is the set which the Infinity Axiom 
says exists. 

A member y of x is a minimal member of x if y and x have no member 
in common. Our next axiom states that every non-empty set has a minimal 
member. 


REGULARITY AXIOM: ∃y (y ∈ x) → ∃y ∈ x ∀z ∈ y (z ∉ x). 


We will now show why the Regularity Axiom is true. We say that a stage 
S is minimal for x if some member of x is formed at S but no member of x 
is formed before S. If S is minimal for x and y is a member of x formed at 
S, then y is a minimal member of x; for every member of y is formed 
before S and therefore cannot be in x. 
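
The argument just given can be checked on small examples. In the sketch below, which is an illustration under assumptions not made in the text, sets are finite and modelled by `frozenset`, stages are numbered 0, 1, 2, ..., and the hypothetical function `rank` returns the number of the first stage at which a set can be formed.

```python
EMPTY = frozenset()
one = frozenset({EMPTY})          # {∅}
two = frozenset({EMPTY, one})     # {∅, {∅}}

def rank(s):
    """Number of the first stage at which the finite set s can be formed."""
    return max((rank(m) + 1 for m in s), default=0)

def is_minimal_member(y, x):
    """y is a member of x, and y and x have no member in common."""
    return y in x and not (y & x)

def member_at_minimal_stage(x):
    """A member of the non-empty set x formed at a stage which is minimal for x."""
    return min(x, key=rank)

x = frozenset({one, two, frozenset({two})})
y = member_at_minimal_stage(x)
print(y == one, is_minimal_member(y, x))   # True True
print(is_minimal_member(two, x))           # False: one is a member of both two and x
```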

It will thus suffice to show that for every non-empty set x, there is a stage 
which is minimal for x. This has often been taken as an evident property of 
stages. However, we shall give a proof (due to Dana Scott) that this follows 
from the facts we already know about stages. 

A set x is grounded if every set containing x has a minimal member. (Of 
course, when we know that the Regularity Axiom is true, we will know that 
every set is grounded.) 

If every member of x is grounded, then x is grounded. For let x ∈ y. If x 
and y are disjoint, then x is the required minimal member of y. If not, then 
y contains a member of x, which is grounded by hypothesis. Hence again y 
has a minimal member. 

For each stage S, let G_S be the set of all grounded sets formed before S. 
This is certainly a set, since it can be formed at S; and it is grounded by the 
previous paragraph. If T is before S, then G_T is grounded and is formed at 
T which is before S; so G_T ∈ G_S. 

Now let x be a non-empty set. Say x is formed at S. Let y be the set of all 
G_T, where T is before S and some member of x is formed at T. This is a 
set, since each such G_T is formed before S. It is non-empty (because x is 
non-empty) and all of its members are grounded. Hence y has a minimal 
member G_T. We claim that T is minimal for x. If not, there is a stage U 
before T at which a member of x is constructed. By the above, G_U ∈ G_T; 
and, since U is before S, G_U ∈ y. This contradicts the choice of G_T. Our 
proof is complete. 

We shall later add one more axiom, the Axiom of Choice. The axiom 
system consisting of these axioms is designated by ZFC. It is generally 
considered as the standard set of axioms for set theory. 


4. Development of set theory 


We are not going to give a detailed development of set theory, but are 
merely going to indicate how the various axioms are used in the development. 

We will use {x: φ(x)} for the set of x such that φ(x). More formally, 
{x: φ(x)} is introduced by the axiom 


∀x (x ∈ {x: φ(x)} ↔ φ(x)). 


Thus before using this expression, we must prove 


∃!y ∀x (x ∈ y ↔ φ(x)). 


Now if there is such a y, it is unique by the Extensionality Axiom; and the 
statement that such a y exists is just Set{x: φ(x)}. Thus we may use 
{x: φ(x)} only when we have proved Set{x: φ(x)}. 

We shall write {F(x): φ(x)} for {y: ∃x (φ(x) ∧ y = F(x))}. Thus the 
Replacement Axiom is Set{F(x): x ∈ y}. 

The first task is to define the usual operations of set theory. The main 
function of the axioms here is to show that the sets needed exist. We sketch 
how this is done. 

First we define the empty set: 


∅ = {x: x ≠ x}. 


To show that this exists, let y be any set. (The Infinity Axiom shows that 
there is a set. Alternatively, one can use the usual axioms of logic to 
conclude that there is at least one set.) Then ∀x (x ≠ x → x ∈ y); so 
Set{x: x ≠ x} by the Separation Axiom. 

Next we define 


Un(z) = {x: ∃y ∈ z (x ∈ y)}, 
P(y) = {x: x ⊆ y}. 


These sets exist by the Union Axiom and the Power Set Axiom. We call 
Un(z) the union of z and P(y) the power set of y. 

Now we define the set consisting of x and y: 


{x, y} = {z: z = x ∨ z = y}. 


It is a little work to see that this set exists. Define an operation F by 
F(∅) = x, F(w) = y if w ≠ ∅. If v is any set, {F(w): w ∈ v} is a set by the 
Replacement Axiom. Hence by the Separation Axiom, it suffices to choose 
v so that x and y are in {F(w): w ∈ v}. This means that v must contain ∅ 
and a set other than ∅. It is easy to see that v = P(P(∅)) will do. 
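
The construction of the pair can be followed step by step on finite sets. The following sketch is an illustration only, assuming sets are modelled by `frozenset`; the names `power_set` and `pair` are hypothetical, and the operation F is the one defined above.

```python
from itertools import combinations

EMPTY = frozenset()

def power_set(x):
    """The set of all subsets of the finite set x."""
    items = list(x)
    return frozenset(frozenset(c) for k in range(len(items) + 1)
                     for c in combinations(items, k))

def pair(x, y):
    v = power_set(power_set(EMPTY))            # v = {∅, {∅}}
    F = lambda w: x if w == EMPTY else y       # F(∅) = x, F(w) = y otherwise
    image = frozenset(F(w) for w in v)         # Replacement: {F(w): w ∈ v}
    # Separation: keep the z with z = x or z = y (here this is all of image)
    return frozenset(z for z in image if z == x or z == y)

a = frozenset({EMPTY})
b = frozenset({a})
print(pair(a, b) == frozenset({a, b}))   # True
print(pair(a, a) == frozenset({a}))      # True
```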

Now we can define the Boolean operations: 


x ∪ y = Un({x, y}), 
x ∩ y = {z: z ∈ x ∧ z ∈ y}, 
x − y = {z: z ∈ x ∧ z ∉ y}. 


The last two sets exist by the Separation Axiom. 

Now we can define the set whose members are x_1, ..., x_n by induction 
on n: 


{x_1} = {x_1, x_1}, 
{x_1, ..., x_{n+1}} = {x_1, ..., x_n} ∪ {x_{n+1}}. 


Other set operations are now easily defined. 

Set theory deals not only with sets, but also with relations and functions, 
which are sets of ordered pairs.³ The ordered pair (x, y) is generally not 
thought of as a set; but we can identify it with a set if this set does 
everything we want the ordered pair to do. But all we want the ordered 
pair to do is specify its first and second element. In other words, the only 
essential property of the ordered pair is 


(x, y) = (z, w) ↔ x = z ∧ y = w.    (2) 


There are several definitions of (x, y) which will achieve this; the simplest is 


(x, y) = {{x}, {x, y}}. 


We leave it to the reader to verify (2). 
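
A proof of (2) is a short exercise; as a sanity check, (2) can also be tested exhaustively on a few small sets. The sketch below assumes sets are modelled by `frozenset`; the name `kpair` and the choice of the four test sets are hypothetical.

```python
from itertools import product

EMPTY = frozenset()
one = frozenset({EMPTY})
two = frozenset({EMPTY, one})
universe = [EMPTY, one, two, frozenset({one})]

def kpair(x, y):
    """The ordered pair (x, y) defined as {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})

# (x, y) = (z, w) exactly when x = z and y = w, for all choices from universe
ok = all((kpair(x, y) == kpair(z, w)) == (x == z and y == w)
         for x, y, z, w in product(universe, repeat=4))
print(ok)   # True
```

Such a test is no substitute for the proof, since it covers only the listed sets.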

We can now define the Cartesian product: 


x × y = {(z, w): z ∈ x ∧ w ∈ y}. 


To show that this exists, note that 


z ∈ x ∧ w ∈ y → (z, w) ∈ P(P(x ∪ y)) 


and then use the Separation Axiom. 
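
The existence argument for the Cartesian product can be imitated literally for small finite sets: form P(P(x ∪ y)) and separate out the ordered pairs. This is a sketch under the assumption that sets are modelled by `frozenset`; the names `power_set`, `kpair` and `cartesian` are hypothetical, and the method is only practical when x ∪ y has very few members.

```python
from itertools import combinations

EMPTY = frozenset()
one = frozenset({EMPTY})
two = frozenset({EMPTY, one})

def power_set(x):
    """The set of all subsets of the finite set x."""
    items = list(x)
    return frozenset(frozenset(c) for k in range(len(items) + 1)
                     for c in combinations(items, k))

def kpair(x, y):
    """The ordered pair (x, y) defined as {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})

def cartesian(x, y):
    """Separate the ordered pairs (z, w), z in x and w in y, out of P(P(x ∪ y))."""
    big = power_set(power_set(x | y))
    return frozenset(p for p in big
                     if any(p == kpair(z, w) for z in x for w in y))

x = frozenset({EMPTY, one})
y = frozenset({one, two})
print(len(power_set(power_set(x | y))))   # 256
print(len(cartesian(x, y)))               # 4
```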

One application is the extension of the Replacement Axiom to two (or 
more) arguments: 


Set{F(z, w): z ∈ x ∧ w ∈ y}.    (3) 


For this, define a new operation G by G((z, w)) = F(z, w) and G(v) = ∅ if 
v is not an ordered pair. (This is a legitimate definition by (2).) Then 


{F(z, w): z ∈ x ∧ w ∈ y} = {G(v): v ∈ x × y}; 


so (3) follows from the Replacement Axiom. 

Now we are going to show that our theory can also deal with the usual 
objects of mathematics. These are usually formed by set-theoretic opera- 
tions from numbers. Now it is well known that all numbers (real, complex, 
integral and rational) can be constructed from the natural numbers (again 
using set-theoretical operations). We will show how to define the natural 
numbers in ZFC. 

It is obvious that each natural number n must be identified with a set; 
which set shall we choose? It is natural to choose a set having n members; 
and the obvious such set is the set of natural numbers less than n. Thus the 
number 0 is identified with the empty set, 1 is identified with {0}, 2 with 
{0, 1}, and so on. This makes it clear that the successor operation must be 
defined by 


Sc(x) = x ∪ {x}. 


³ We identify a function f with the set of all ordered pairs of the form (f(x), x), rather than 
the set of all ordered pairs (x, f(x)). 


Now say that a set is inductive if it contains 0 and is closed under the 
successor operation. We then define a natural number to be a set which 
belongs to every inductive set. 

It remains to prove the Peano axioms. The only one which gives any 
problem is 


Sc(x) = Sc(y) → x = y.    (4) 


Suppose that Sc(x) = Sc(y). By the Regularity Axiom, {x, y} has a minimal 
element; and by symmetry, we may suppose that this is y. Then x ∉ y. Now 
x ∈ Sc(x) = Sc(y) = y ∪ {y}; so x ∈ y or x = y. Hence x = y. (For the case 
in which x and y are natural numbers, (4) can be proved without using the 
Axiom of Regularity. However, (4) is sometimes useful for arbitrary sets x 
and y.) 

The Infinity Axiom says that an inductive set exists. From this and the 
Separation Axiom, we see that the set of natural numbers exists. 
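
The first few natural numbers can be written out explicitly. The sketch below is an illustration only: it assumes sets are modelled by `frozenset`, and it builds numerals by iterating Sc rather than by the definition through inductive sets, which cannot be carried out in a finite computation. The name `numeral` is hypothetical.

```python
EMPTY = frozenset()

def Sc(x):
    """The successor operation: Sc(x) = x ∪ {x}."""
    return x | frozenset({x})

def numeral(n):
    """The set identified with the natural number n."""
    x = EMPTY
    for _ in range(n):
        x = Sc(x)
    return x

N = [numeral(n) for n in range(6)]
print(all(len(N[n]) == n for n in range(6)))                        # True: n has n members
print(all((N[m] in N[n]) == (m < n)
          for m in range(6) for n in range(6)))                     # True: members of n are the m < n
print(all((Sc(a) == Sc(b)) == (a == b) for a in N for b in N))      # True: instance of (4)
```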

We can now see why urelements are unnecessary: all of the objects we 
wish to consider are sets, or at least can be identified with sets. Actually, it 
would require only a little additional effort to reformulate our axiom 
system to allow urelements, and this would be useful for some purposes. 

It is rather surprising that we can define all of the usual objects of
mathematics and prove their properties in ZFC. Certainly this shows that 
ZFC is a very strong axiom system. We should not, however, make too 
much of this fact. To identify mathematics with ZFC (or to say, somewhat 
mysteriously, that ZFC is a foundation for mathematics) is both useless and 
misleading. It leads one to think that objects which are not definable in 
ZFC are not mathematical objects, and that truths which cannot be proved 
in ZFC are not mathematical truths. This is an unfruitful limitation on 
mathematics. 


5. Ordinals 


Although stages entered into our description of sets, they did not enter 
into the axioms. We shall show that nothing is lost thereby; we can define 
the stages and prove that they have the appropriate properties in ZFC. 

We intend to identify the stages with certain sets, which we call ordinals. 
Roughly speaking, the ordinals are obtained by extending the sequence of 
natural numbers.

A set x is transitive if every member of x is a subset of x. An ordinal is a
transitive set x such that every member of x is transitive. We use Greek 
letters for ordinals. 
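
For hereditarily finite sets, these two definitions can be checked directly. The following sketch assumes again that sets are modelled by `frozenset`; the function names and the sample sets are ours.

```python
def is_transitive(x: frozenset) -> bool:
    """Every member of x is a subset of x."""
    return all(member <= x for member in x)


def is_ordinal(x: frozenset) -> bool:
    """x is transitive and every member of x is transitive."""
    return is_transitive(x) and all(is_transitive(member) for member in x)


zero = frozenset()
one = frozenset({zero})                         # 1 = {0}
two = frozenset({zero, one})                    # 2 = {0, 1}
singleton_one = frozenset({one})                # {1}
mixed = frozenset({zero, one, singleton_one})   # {0, 1, {1}}

print(is_ordinal(two))
print(is_transitive(singleton_one))   # 0 is a member of 1 but not of {1}
print(is_transitive(mixed), is_ordinal(mixed))
```

The set {0, 1, {1}} is transitive but is not an ordinal, since its member {1} is not transitive; so the second clause of the definition is not redundant.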


5.1. THEOREM. The set 0 is an ordinal. If $\alpha$ is an ordinal, then $\mathrm{Sc}(\alpha)$ is an
ordinal.


PROOF. The easy proof is left to the reader. □

COROLLARY. Every natural number is an ordinal.

5.2. THEOREM. Every member of an ordinal is an ordinal.


PROOF. Let $x \in \alpha$. Then x is transitive; so we need only show that every
member y of x is transitive. Since $\alpha$ is transitive, $x \subseteq \alpha$; so $y \in \alpha$, and
hence y is transitive. □


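We define

$$\alpha < \beta \leftrightarrow \alpha \in \beta,$$
$$\alpha \le \beta \leftrightarrow \alpha < \beta \lor \alpha = \beta.$$

We first show that < partially orders the ordinals, i.e., that

$$\neg(\alpha < \alpha), \tag{5}$$

$$\alpha < \beta \land \beta < \gamma \rightarrow \alpha < \gamma. \tag{6}$$

By the Regularity Axiom, $\alpha$ is a minimal member of $\{\alpha\}$. This means that
$\alpha \notin \alpha$, which is (5). As for (6), it is a consequence of the transitivity of
$\gamma$. □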


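5.3. THEOREM. If $\forall\alpha\,(\forall\beta < \alpha\,\varphi(\beta) \rightarrow \varphi(\alpha))$, then $\forall\alpha\,\varphi(\alpha)$.

PROOF. We assume that the hypothesis holds and that $\neg\varphi(\alpha_0)$, and derive
a contradiction. Let

$$x = \{\alpha : \alpha < \alpha_0 \land \neg\varphi(\alpha)\};$$

this set exists because each such $\alpha$ is in $\alpha_0$. Moreover, $x \ne 0$; for otherwise
the hypothesis would give $\varphi(\alpha_0)$. Thus x has a minimal member $\alpha$. If
$\beta < \alpha$, then $\beta < \alpha_0$ by (6) and $\beta \notin x$ by choice of $\alpha$. Thus $\varphi(\beta)$. The
hypothesis now gives $\varphi(\alpha)$, contradicting $\alpha \in x$. □

Theorem 5.3 tells us that if we wish to prove $\varphi(\alpha)$ for an arbitrary $\alpha$, we
may assume that $\varphi(\beta)$ holds for $\beta < \alpha$. A proof by this method is called a
proof of $\varphi(\alpha)$ by induction on $\alpha$; the assumption that $\varphi(\beta)$ holds for $\beta < \alpha$
is called the induction hypothesis.

We now show that < linearly orders the ordinals, i.e., that

$$\alpha < \beta \lor \alpha = \beta \lor \beta < \alpha. \tag{7}$$

We write (7) as $C(\alpha, \beta)$. We prove $\forall\beta\, C(\alpha, \beta)$ by induction on $\alpha$. To prove
$\forall\beta\, C(\alpha, \beta)$, we prove $C(\alpha, \beta)$ by induction on $\beta$. Thus we are proving
$C(\alpha, \beta)$ using the two induction hypotheses:

$$\forall\gamma < \alpha\; C(\gamma, \beta), \tag{8}$$

$$\forall\gamma < \beta\; C(\alpha, \gamma). \tag{9}$$

Now either $\alpha = \beta$ or $\alpha - \beta \ne 0$ or $\beta - \alpha \ne 0$. If $\alpha = \beta$, then $C(\alpha, \beta)$. Now
suppose that $\alpha - \beta \ne 0$. By Theorem 5.2 and the definition of <, there is a
$\gamma$ such that $\gamma < \alpha$ and $\neg(\gamma < \beta)$. By (8), $C(\gamma, \beta)$, i.e., either $\gamma < \beta$ or
$\beta \le \gamma$. Since the former is false, $\beta \le \gamma$. Since $\gamma < \alpha$, $\beta < \alpha$ by (6); so
$C(\alpha, \beta)$. A similar proof (using (9) instead of (8)) holds if $\beta - \alpha \ne 0$.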

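Using this, we can prove

$$\alpha \le \beta \leftrightarrow \alpha \subseteq \beta. \tag{10}$$

For if $\alpha \le \beta$, then $\gamma < \alpha \rightarrow \gamma < \beta$ by (6); so $\alpha \subseteq \beta$ by Theorem 5.2. If
$\neg(\alpha \le \beta)$, then $\beta < \alpha$ by (7) and $\neg(\beta < \beta)$ by (5). Thus $\beta \in \alpha - \beta$; so
$\neg(\alpha \subseteq \beta)$. □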


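We say that $\alpha$ is the least ordinal such that $\varphi(\alpha)$ if $\varphi(\alpha)$ holds and $\alpha \le \beta$
for every $\beta$ such that $\varphi(\beta)$. There is at most one such $\alpha$, since $\alpha \le \beta$ and
$\beta \le \alpha$ imply $\alpha = \beta$ by (5) and (6).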


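5.4. THEOREM. If $\exists\alpha\,\varphi(\alpha)$, then there is a least ordinal $\alpha$ such that $\varphi(\alpha)$.


PROOF. We suppose that there is no such $\alpha$ and prove $\neg\varphi(\alpha)$ by induction
on $\alpha$. If $\varphi(\beta)$, the induction hypothesis shows that $\neg(\beta < \alpha)$; so $\alpha \le \beta$.
Thus $\neg\varphi(\alpha)$; for otherwise, $\alpha$ would be the least ordinal such that
$\varphi(\alpha)$. □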


Here are two easy examples of least ordinals: 0 is the least ordinal; $\mathrm{Sc}(\alpha)$
is the least ordinal $\beta$ such that $\alpha < \beta$.

5.5. THEOREM. If x is a set of ordinals, then $\mathrm{Un}(x)$ is the least ordinal $\alpha$ such
that $\forall\beta \in x\,(\beta \le \alpha)$.


PROOF. The union of a set of transitive sets is transitive; so Un(x) is
transitive. Every member of Un(x) is a member of some ordinal in x and
hence is transitive. Thus Un(x) is an ordinal. The theorem then follows
from (10). □


COROLLARY. If x is a set of ordinals, there is an ordinal greater than every
member of x.


PROOF. Take $\mathrm{Sc}(\mathrm{Un}(x))$. □


By the Corollary, there is an ordinal which is not a natural number. The
least such ordinal is designated by $\omega$. Since $\omega \ne 0$ and 0 is the least ordinal,

$$0 < \omega. \tag{11}$$

Also

$$\alpha < \omega \rightarrow \mathrm{Sc}(\alpha) < \omega. \tag{12}$$

For if $\alpha < \omega$, then $\mathrm{Sc}(\alpha) \le \omega$ because $\mathrm{Sc}(\alpha)$ is the least ordinal greater than
$\alpha$. Since $\alpha < \omega$, $\alpha$ is a natural number; so $\mathrm{Sc}(\alpha)$ is a natural number; so
$\mathrm{Sc}(\alpha) \ne \omega$.

From (11) and (12), every natural number is less than $\omega$ and hence
belongs to $\omega$. But everything in $\omega$ is an ordinal less than $\omega$ and hence a
natural number. Thus $\omega$ is just the set of natural numbers.

Now we turn to definitions of operations by induction. The idea is that
we wish to define $F(\alpha)$ in terms of $\alpha$ and the $F(\beta)$ for $\beta < \alpha$. All of these
$F(\beta)$ can be combined into a single object $F \restriction \alpha$ defined by

$$F \restriction \alpha = \{(F(\beta), \beta) : \beta < \alpha\}.$$


5.6. THEOREM. If G is a binary operation, then there is a unary operation F
such that $F(\alpha) = G(\alpha, F \restriction \alpha)$ for all $\alpha$.


PROOF. By an $\alpha$-function, we shall mean a function f with domain $\alpha$ such
that $f(\beta) = G(\beta, f \restriction \beta)$ for all $\beta < \alpha$. It is easy to see that if f is an
$\alpha$-function and $\beta < \alpha$, then $f \restriction \beta$ is a $\beta$-function.

We show that there is at most one $\alpha$-function. For this, we assume that f
and g are $\alpha$-functions and prove

$$\beta < \alpha \rightarrow f(\beta) = g(\beta)$$

by induction on $\beta$. If $\gamma < \beta$, then $f(\gamma) = g(\gamma)$ by the induction hypothesis.
Thus $f \restriction \beta = g \restriction \beta$; so

$$f(\beta) = G(\beta, f \restriction \beta) = G(\beta, g \restriction \beta) = g(\beta).$$

If there is an $\alpha$-function f, we set $F(\alpha) = G(\alpha, f)$; otherwise, we set
$F(\alpha) = 0$. (It turns out that this otherwise never occurs.)

We first show that $F(\alpha) = G(\alpha, F \restriction \alpha)$ holds if there is an $\alpha$-function. If f
is the $\alpha$-function, it is enough to show that $F \restriction \alpha = f$. If $\beta < \alpha$, then $f \restriction \beta$ is a
$\beta$-function; so $F(\beta) = G(\beta, f \restriction \beta) = f(\beta)$ because f is an $\alpha$-function. Thus
$F \restriction \alpha = f$.

It remains to show that there is an $\alpha$-function for each $\alpha$. We show that
$F \restriction \alpha$ is an $\alpha$-function by induction on $\alpha$. Let $f = F \restriction \alpha$. If $\beta < \alpha$, the
induction hypothesis and the previous paragraph show that
$F(\beta) = G(\beta, F \restriction \beta)$. This is equivalent to $f(\beta) = G(\beta, f \restriction \beta)$, which is what we
want. □


In practice, Theorem 5.6 justifies any sort of definition in which $F(\alpha)$ is
given in terms of $\alpha$ and the $F(\beta)$ for $\beta < \alpha$. For example, let x be a set and
define $F(\alpha)$ to be {x} if $\alpha = 0$, $\mathrm{Un}(F(\beta))$ if $\alpha = \mathrm{Sc}(\beta)$, and
$\mathrm{Un}\{F(\beta) : \beta < \alpha\}$ otherwise. We easily find a G so that Theorem 5.6 gives this F. The
importance of this example is that $F(\omega)$ is a transitive set containing x. For
by (11) and (12), $F(\omega)$ is the union of the $F(\alpha)$ for $\alpha < \omega$. In particular,
$F(0) \subseteq F(\omega)$; so $x \in F(\omega)$. If $y \in F(\omega)$, then $y \in F(\alpha)$ for some $\alpha < \omega$.
Then $y \subseteq \mathrm{Un}(F(\alpha)) = F(\mathrm{Sc}(\alpha)) \subseteq F(\omega)$ by (12). Thus $F(\omega)$ is transitive.
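
For a hereditarily finite set x, the stages F(0), F(1), ... of this example eventually become empty, so their union can be computed outright. The following sketch makes that assumption and models sets by `frozenset`; the names `un` and `transitive_container` are ours.

```python
def un(s: frozenset) -> frozenset:
    """Un(s): the union of the members of s."""
    return frozenset(element for member in s for element in member)


def transitive_container(x: frozenset) -> frozenset:
    """Union of the stages F(0) = {x}, F(1) = Un(F(0)), F(2) = Un(F(1)), ..."""
    stage = frozenset({x})
    result = frozenset()
    while stage:          # assumes x is hereditarily finite, so the stages run out
        result |= stage
        stage = un(stage)
    return result


zero = frozenset()
one = frozenset({zero})
x = frozenset({one})      # x = {1}, which is not itself transitive
t = transitive_container(x)
print(x in t)
print(all(member <= t for member in t))   # t is transitive
print(len(t))
```

Here the result is {x, 1, 0}: a transitive set having x as a member, as the argument above predicts.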


5.7. THEOREM. If $\forall x\,(\forall y \in x\,\varphi(y) \rightarrow \varphi(x))$, then $\forall x\,\varphi(x)$.


PROOF. We suppose that the hypothesis holds and that $\neg\varphi(z)$, and derive
a contradiction. Using the above, select a transitive set w containing z, and
set $v = \{x : x \in w \land \neg\varphi(x)\}$. Since $z \in v$, v has a minimal member x. If
$y \in x$, then $y \in w$ (because w is transitive) and $y \notin v$; so $\varphi(y)$. By the
hypothesis, we have $\varphi(x)$, contradicting $x \in v$. □


Theorem 5.7 tells us that if we wish to prove $\varphi(x)$ for an arbitrary x, we
can assume that $\varphi(y)$ holds for all $y \in x$. A proof by this method is called a
proof of $\varphi(x)$ by $\in$-induction on x; the assumption that $\varphi(y)$ holds for all
$y \in x$ is called the induction hypothesis.

We now identify the stages with the ordinals and identify the relation
before with <. We say that a set x is formed at the stage $\alpha$ if $x \subseteq R(\alpha)$,
where the operation R is defined by induction as follows:

$$R(\alpha) = \mathrm{Un}\{P(R(\beta)) : \beta < \alpha\}.$$


We have thus defined the concepts of Section 2 in ZFC. We now proceed 
to prove the appropriate properties of these concepts. 
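
The first few values of R can be computed explicitly, which may help to make the definition concrete. The sketch below handles only finite stages n and models sets by `frozenset`; the helper `powerset` and the use of `lru_cache` are our choices, not part of the text.

```python
from functools import lru_cache
from itertools import combinations


def powerset(s: frozenset) -> frozenset:
    """P(s): the set of all subsets of s."""
    items = list(s)
    return frozenset(
        frozenset(subset)
        for size in range(len(items) + 1)
        for subset in combinations(items, size)
    )


@lru_cache(maxsize=None)
def R(n: int) -> frozenset:
    """R(n) = Un{P(R(m)) : m < n} for a finite stage n."""
    result = frozenset()
    for m in range(n):
        result |= powerset(R(m))
    return result


print([len(R(n)) for n in range(5)])   # [0, 1, 2, 4, 16]
```

The sizes grow very quickly (the next one is 65536), so the sketch is only practical for very small n.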
By the definition of R, 


$$y \in R(\alpha) \leftrightarrow \exists\beta < \alpha\,(y \subseteq R(\beta)); \tag{13}$$


i.e., y is in $R(\alpha)$ iff it is formed before $\alpha$. It follows that x is formed at $\alpha$ iff
every member of x is formed before $\alpha$. This is the first basic property of set
formation.

Next, we must show that if every member of a collection is formed 
before a, then the collection is a set. By (13), this is expressed by 


$$\forall x\,(\varphi(x) \rightarrow x \in R(\alpha)) \rightarrow \mathrm{Set}\{x : \varphi(x)\}.$$


This follows from the Separation Axiom. 
Finally, we must show that every set is formed at some stage, i.e., 


$$\exists\alpha\,(x \subseteq R(\alpha)). \tag{14}$$


We prove this by $\in$-induction on x. Let F(y) be the least $\beta$ such that
$y \subseteq R(\beta)$, or 0 if there is no such $\beta$. Using the Corollary to Theorem 5.5,
choose $\alpha$ greater than every ordinal in $\{F(y) : y \in x\}$. If $y \in x$ and
$\beta = F(y)$, then $y \subseteq R(\beta)$ by the induction hypothesis and $\beta < \alpha$. Thus
$y \in R(\alpha)$ by (13). We have proved that $x \subseteq R(\alpha)$.

Putting {x} for x in (14), we see that every set belongs to an $R(\alpha)$. This
fact is often useful. For example, we can define an operation F on all sets
by defining it on each $R(\alpha)$, using induction on $\alpha$. Another use will be
mentioned in the next section. Thus we see that stages are useful in proving
theorems as well as in justifying axioms.


6. The Axiom of Choice 


A choice function on x is a function f with domain $x - \{0\}$ such that
$f(y) \in y$ for all y in the domain of f. The Axiom of Choice says that for
every set x, there is a choice function on x. (More precisely, the Axiom of
Choice is the translation of that sentence into the language of set theory.)

Why is the Axiom of Choice true? We already know that $\mathrm{Un}(x) \times x$ is a
set and hence is formed at some stage S. Then every pair (z, y) with
$z \in y \land y \in x$ is formed before S. At stage S, we can pick one such (z, y)
for each y in $x - \{0\}$ and form the set of these (z, y). This set will be a
choice function on x.

There is one difficulty in this argument: what do we mean when we say
we can pick one (z, y) for each y? Obviously we do not mean that a person 
can actually pick these pairs, since there may be infinitely many of them. 
Nor do we mean that there is a rule for picking them; for no matter how we 
interpret the word rule, we can think of no reason why such a rule should 
exist for every set x. Thus all we can mean is that there is a collection of sets 
which contains exactly one pair (z, y) for each y. If we interpret a collection 
as being an arbitrary division of the objects available into members and 
non-members of the collection, it is reasonable to claim that such a 
collection exists. 

The Axiom of Choice has many applications in mathematics, some of 
which are discussed in Chapter B.2. In set theory, the most interesting 
applications are to cardinals; so we shall give a brief introduction to 
cardinals. 

We say that x and y are equinumerous, and write x ~ y, if there is a 
one-one mapping of x onto y. It is trivial to verify that this is an 
equivalence relation. We want to associate with each set x a set |x|, called 
the cardinal of x, so that 


$$|x| = |y| \leftrightarrow x \sim y. \tag{15}$$


The first thought is to use equivalence classes, i.e., to set $|x| = \{y : y \sim x\}$.
This will not work because $\{y : y \sim x\}$ is not a set. A solution
suggested by the last section is to set $|x| = \{y : y \in R(\alpha) \land y \sim x\}$, where $\alpha$
is the least ordinal such that $\exists y \in R(\alpha)\,(y \sim x)$. (This exists, since $x \sim x$
and x belongs to some $R(\alpha)$.) While this works perfectly well, there is
another solution which works even better. We define |x| to be the least
ordinal equinumerous with x. First, however, we must prove that there is
such an ordinal.


6.1. THEOREM. If f is a choice function on P(x), then there is a one-one
mapping g of an ordinal $\alpha$ onto x such that for all $\beta < \alpha$,

$$g(\beta) = f(x - \{g(\gamma) : \gamma < \beta\}).$$

PROOF. Define F by induction as follows:

$$F(\alpha) = f(x - \{F(\beta) : \beta < \alpha\}).$$

(Here f(0) can be taken to be any set, say 0.) Then

$$x - \{F(\beta) : \beta < \alpha\} \ne 0 \rightarrow F(\alpha) \in x \land \forall\beta < \alpha\,(F(\beta) \ne F(\alpha)). \tag{16}$$

We first show that $x \subseteq \{F(\beta) : \beta < \alpha\}$ for some $\alpha$. If not, (16) shows that
F maps the ordinals one-one into x. The inverse of F then maps some
subset of x onto the collection of all ordinals; so this collection is a set by
the Replacement Axiom. This contradicts the Corollary to Theorem 5.5.

Let $\alpha$ be the least ordinal such that $x \subseteq \{F(\beta) : \beta < \alpha\}$. By (16), $F \restriction \alpha$
maps $\alpha$ one-one onto x and hence is the required g. □
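
In the finite case the construction in this proof is just a loop: at each step the choice function is applied to what remains of x. The sketch below assumes a finite set x and represents g by the list of its values; the name `enumerate_by_choice` and the use of `min` as a choice function are ours.

```python
def enumerate_by_choice(x: frozenset, f) -> list:
    """Return [g(0), g(1), ...], where g(b) = f(x - {g(c) : c < b})."""
    g = []
    remaining = set(x)
    while remaining:                       # stop at the least stage at which x is exhausted
        chosen = f(frozenset(remaining))   # f picks a member of a nonempty subset of x
        g.append(chosen)
        remaining.discard(chosen)
    return g


print(enumerate_by_choice(frozenset({3, 1, 2}), min))   # [1, 2, 3]
```

A different choice function gives a different one-one enumeration of the same set; for instance, `max` lists the members in decreasing order.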


We are thus justified in defining | x | to be the least ordinal equinumerous 
with x. As remarked, |x| is called the cardinal of x. A set is a cardinal if it 
is the cardinal of some set. Every cardinal is an ordinal; and an ordinal $\alpha$ is
a cardinal iff $\alpha = |\alpha|$.

We are now going to examine the $\le$ relation on cardinals. We first
prove

$$x \subseteq \delta \rightarrow |x| \le \delta. \tag{17}$$

Define a choice function f on P(x) as follows: if $y \in P(x) - \{0\}$, let f(y) be
the least ordinal in y. Let g and $\alpha$ be as in Theorem 6.1. It is easy to check
that

$$\beta < \gamma \rightarrow g(\beta) < g(\gamma). \tag{18}$$

We prove

$$\beta < \alpha \rightarrow \beta \le g(\beta) \tag{19}$$

by induction on $\beta$. Assume that $\beta < \alpha$ but $g(\beta) < \beta$. By (18), $g(g(\beta)) < g(\beta)$.
But since $g(\beta) < \beta$, the induction hypothesis gives $g(\beta) \le g(g(\beta))$.
This contradiction proves (19).

Now we complete the proof of (17). Suppose $\delta < |x|$. Clearly $|x| \le \alpha$; so
$\delta < \alpha$ and hence $\delta \le g(\delta)$ by (19). But $g(\delta) \in x$, so $g(\delta) \in \delta$, i.e., $g(\delta) < \delta$.
Thus we have a contradiction.


6.2. THEOREM. If $\alpha$ and $\beta$ are cardinals, then $\alpha \le \beta$ iff there is a set having
cardinal $\beta$ which has a subset having cardinal $\alpha$.


PROOF. If $\alpha \le \beta$, then $\alpha \subseteq \beta$ by (10). Since $|\alpha| = \alpha$ and $|\beta| = \beta$, $\beta$ is the
desired set. Now suppose that $|x| = \alpha$, $|y| = \beta$, $x \subseteq y$. There is a one-one
mapping of y onto $\beta$; it maps x onto a subset z of $\beta$. Thus $\alpha = |x| = |z| \le \beta$
by (17). □


It is fairly easy to show that every natural number is a cardinal and that $\omega$
is a cardinal. Further cardinals can be obtained by the next theorem.

6.3. THEOREM. For all x, $|x| < |P(x)|$.


PROOF. There is a one-one mapping of x into P(x) which maps y into {y}.
Thus $|x| \le |P(x)|$ by Theorem 6.2. We assume that $|x| = |P(x)|$ and
derive a contradiction. By (15), there is a one-one mapping f of x onto
P(x). Let $y = \{z : z \in x \land z \notin f(z)\}$. Then y = f(w) for some w. Thus

$$w \in y \leftrightarrow w \notin f(w) \leftrightarrow w \notin y,$$

a contradiction. □
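
The diagonal set used in this proof can be examined exhaustively when x is small. The sketch below assumes x = {0, 1, 2}, represents a mapping f of x into P(x) by a Python dictionary, and checks every such mapping; the names `powerset` and `diagonal` are ours.

```python
from itertools import combinations, product


def powerset(x: list) -> list:
    """All subsets of x, each as a frozenset."""
    return [
        frozenset(subset)
        for size in range(len(x) + 1)
        for subset in combinations(x, size)
    ]


def diagonal(x: list, f: dict) -> frozenset:
    """The set {z : z in x and z not in f(z)}."""
    return frozenset(z for z in x if z not in f[z])


x = [0, 1, 2]
subsets = powerset(x)

# For every mapping f of x into P(x), the diagonal set is not a value of f.
missed = all(
    diagonal(x, dict(zip(x, images))) not in images
    for images in product(subsets, repeat=len(x))
)
print(missed)
```

Thus no mapping of this x into P(x) is onto, in agreement with the theorem; the general proof is of course not a finite search.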


This brief treatment will give the flavor of cardinal theory, but it may not 
make evident the crucial role of the Axiom of Choice. If we did not have 
this axiom, we could define cardinals by the first method mentioned above. 
We could then define $\alpha \le \beta$ (for $\alpha$ and $\beta$ cardinals) to mean that there is a
set having cardinal $\beta$ which has a subset having cardinal $\alpha$. We would not
then be able to prove that

$$\alpha \le \beta \lor \beta \le \alpha$$

for any two cardinals $\alpha$ and $\beta$ without the Axiom of Choice.

We conclude this section by showing how Zorn’s Lemma is proved from 
the Axiom of Choice. Recall that a partially ordered set x is inductively 
ordered if every linearly ordered subset of x has an upper bound. Zorn’s 
Lemma says that every inductively ordered set x has a maximal element. 

To prove this, we let f be a choice function on P(x). We define an
operation F by induction as follows. Let $x_\alpha$ be the set of all y in x such that
$F(\beta) < y$ for all $\beta < \alpha$. Let $F(\alpha) = f(x_\alpha)$ if $x_\alpha \ne 0$, and $F(\alpha) = 0$ if $x_\alpha = 0$.
Just as in the proof of Theorem 6.1, we show that $x_\alpha = 0$ for some $\alpha$.
Choose the least such $\alpha$. If $\gamma < \beta < \alpha$, then $F(\gamma) < F(\beta)$ by the choice of
$F(\beta)$; so $\{F(\beta) : \beta < \alpha\}$ is linearly ordered. It thus has an upper bound y;
and y is a maximal element because any larger element would be in $x_\alpha$.
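
For a finite partially ordered set the same construction can be carried out step by step. The sketch below is an illustration under that assumption only: in the finite case the upper bound of the chain is simply its last member. The names `zorn_chain` and `strictly_divides`, and the example of divisibility on {1, ..., 12}, are ours.

```python
def zorn_chain(x, less, f) -> list:
    """Build F(0), F(1), ... with F(a) = f(x_a), where x_a is the set of
    all y in x lying strictly above every earlier F(b)."""
    chain = []
    while True:
        candidates = frozenset(y for y in x if all(less(c, y) for c in chain))
        if not candidates:        # x_a is empty: the construction stops
            return chain
        chain.append(f(candidates))


def strictly_divides(a: int, b: int) -> bool:
    """The strict partial order: a divides b and a differs from b."""
    return a != b and b % a == 0


chain = zorn_chain(range(1, 13), strictly_divides, min)
print(chain)   # [1, 2, 4, 8]
```

The last member, 8, is maximal: no other number up to 12 is a multiple of it. With `max` as the choice function the chain is just [12], which ends in a different maximal element.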


7. Classes 


We have noted that certain collections of sets which are not sets can 
nevertheless be discussed in the language of set theory. We shall now 
consider this in more detail. 

Henceforth, $\{x : \varphi(x)\}$ represents the collection of all sets x such that
$\varphi(x)$, even if that collection is not a set. Such a collection is called a class.
More precisely, if each variable of $\varphi(x)$ other than x is assigned a set as its
meaning, then $\{x : \varphi(x)\}$ represents a definite collection; and any such
collection is called a class.

Every set y is a class; for $y = \{x : x \in y\}$. However, not every class is a
set. For example, the Russell paradox shows that $\{x : x \notin x\}$ is not a set, and
the Corollary to Theorem 5.5 shows that the collection of all ordinals
(which is clearly a class) is not a set. A class which is not a set is called a
proper class.

We want $\{x : \varphi(x)\}$ to be a defined symbol; that is, we want every context
containing it to be an abbreviation for an expression in the language of set
theory. With the new meaning of $\{x : \varphi(x)\}$, we cannot do this by the
method of Section 4. Instead, we must examine the contexts in which
$\{x : \varphi(x)\}$ may appear.

We wish to allow the expression $\{x : \varphi(x)\}$ to appear immediately before
or after $\in$ or =. The case in which it occurs immediately after $\in$ is taken
care of by the definition

$$y \in \{x : \varphi(x)\} \leftrightarrow \varphi(y).$$

Before proceeding, we introduce some notation. A term is an expression
which is either a variable or of the form $\{x : \varphi(x)\}$. We use A, B and C to
represent terms.

The case in which $\{x : \varphi(x)\}$ occurs immediately before or after = is
taken care of by the definition

$$A = B \leftrightarrow \forall x\,(x \in A \leftrightarrow x \in B). \tag{20}$$

The contexts $x \in A$ and $x \in B$ on the right are taken care of by the
previous definition.

Strictly speaking, (20) is a definition only when at least one of A and B is 
not a variable; for if they are both variables, then A = B is already an 
expression of the language of set theory. However, (20) is still true in this 
case, as the Extensionality Axiom shows. 

Finally, we take care of the case in which $\{x : \varphi(x)\}$ immediately
precedes $\in$ by the definition

$$\{x : \varphi(x)\} \in B \leftrightarrow \exists y\,(y = \{x : \varphi(x)\} \land y \in B).$$

Thus $\{x : \varphi(x)\} \in B$ cannot be true unless $\{x : \varphi(x)\}$ is a set. This is what we
would expect; every member of a class is a set.

One technical point must be taken care of. Since = between terms is 
now defined rather than a logical symbol, the properties of = must be 
proved rather than covered by logical axioms. This proof is a bit tedious
but without difficulties; it requires the Extensionality Axiom, but no other
axioms of set theory.

A useful class is the class of all sets, defined by

$$V = \{x : x = x\}.$$

Note that $A \in V$ is a way of saying that A is a set. In particular,
$\{x : \varphi(x)\} \in V$ can replace our previous notation $\mathrm{Set}\{x : \varphi(x)\}$.

A word of caution is necessary. We can now use $\{x : \varphi(x)\}$ without first
proving that it is a set. We pay for this by being unable to conclude
$\psi(\{x : \varphi(x)\})$ when we have proved $\forall y\,\psi(y)$. Since $\forall y$ means for all sets y,
we must first prove that $\{x : \varphi(x)\}$ is a set.

In defining the operations and notions of set theory, it is natural to 
extend them to classes. Thus we can now define 


$$A \cap B = \{x : x \in A \land x \in B\}.$$


However, some caution is needed. We can define $\{A\} = \{x : x = A\}$; but if
A is a proper class, then {A} is the empty set (since no set x is equal to A).

If we are dealing with classes, it is natural to let relations and functions
be classes (instead of just sets) of ordered pairs. In particular, the
operations of set theory (as applied to sets) may be thought of as functions.
For example, the operation $\cup$ becomes the class of all (x, (y, z)) with
$x = y \cup z$. (This class is easily seen to be proper.)

At this point, it is easy to say anything we want about a particular class. 
However, no formula of the language of set theory says anything about all 
classes. We indicate by two examples why this is not a great difficulty. 

The first example is the simplest property of equality: every class is equal 
to itself. To show this, we must prove A = A for an arbitrary term A. For 
this, we note that by (20), A = A is equivalent to $\forall x\,(x \in A \leftrightarrow x \in A)$,
and the latter follows from the laws of logic. 

This simple example illustrates the general procedure. We prove that 
something is true of all classes by proving $\varphi(A)$ for an arbitrary term A. In
doing so, we are not proving one formula of the language of set theory, but 
infinitely many formulas, one for each choice of A. This is usually 
immaterial, since with rare exceptions the proof is the same for all A. 

Things are a bit more complicated when we wish to state that a class 
exists. This is illustrated when we try to reformulate Theorem 5.6 to talk
about classes instead of operations. Let $\varphi(A, B)$ result from translating the
following into the language of set theory: if A is a function with domain
$V \times V$, then B is a function with domain the class of ordinals such that
$B(\alpha) = A(\alpha, B \restriction \alpha)$ for all $\alpha$. Then the theorem in question is intuitively
expressed by $\forall A\, \exists B\, \varphi(A, B)$. But this is not a formula of the language of
set theory in any sense, since terms other than variables cannot appear
after quantifiers.

What can it mean to prove $\forall A\, \exists B\, \varphi(A, B)$ in ZFC? An examination of
the proof we gave of Theorem 5.6 gives the answer. What we must do is
show how, given any term A, we can produce a term B and then prove
$\varphi(A, B)$ in ZFC.

These procedures enable us to handle all the statements we wish to make 
about classes. There is another possible procedure: we can extend the 
language by introducing variables for classes and allowing these variables 
to appear after quantifiers. This gives a simpler and more straightforward 
solution to the sort of problem we have been discussing. There is a penalty, 
however; the added notation makes added work when it comes time to do 
independence proofs. On the whole, the recent tendency has been to stick 
to the language of set theory and use the methods described in this section. 


8. New axioms 


The most important discovery in set theory in recent years is that many 
of the important unsolved problems of set theory cannot be settled from 
the axioms of ZFC. Among these are the Continuum Hypothesis and the 
Souslin problem. We are thus led to seek new axioms which solve these 
problems. We will try to give the reader an idea of what has been done
and what remains to be done. 

Where shall we look for new axioms? One idea is suggested by Section 4. 
We found there two principles of the form: every collection of sets 
satisfying certain conditions is a set. When we came to formulating these 
principles as axioms in the language of set theory, we could only say that 
every class satisfying the conditions was a set. That this is not a trivial point 
is shown by the models used in independence proofs. In these models,
there is always a set belonging to the model which has a subset which does 
not belong to the model. 

Unfortunately, it is not easy to deal with collections which are not 
classes. Consider, for example, a simple-minded approach: we add new 
variables which represent arbitrary collections. It is now easy to write 
axioms which say that every collection having a certain property is a set. 
We cannot use such axioms, however, until we have axioms telling us that 
there are collections. The obvious such axioms say that every class is a 
collection. When we have introduced these axioms, we are right back 
where we started from.

A better idea is to introduce into the language symbols for new
operations on sets, so that there are more collections of the form $\{x : \varphi(x)\}$.
Of course these operations must be really new; i.e., they must not be 
definable in the language of set theory. Now very few such operations are 
known, and introducing the known ones does not solve any of the open 
problems of set theory. Thus at present, we can do nothing with this 
approach. 

The solution to making use of arbitrary collections may lie in a different 
direction. If $\varphi(x)$ is a formula in some reasonable language, we may think
of $\varphi(x)$ as providing a rule for determining which sets belong to $\{x : \varphi(x)\}$.
Now as we noted in discussing the Axiom of Choice, there is no reason why 
a collection should have such a rule. If we could further analyze the notion 
of a collection not formed according to a rule, we might arrive at axioms 
other than the Axiom of Choice which utilize such collections, or at least 
find a more convincing argument for the truth of the Axiom of Choice. 

There is another approach to finding new axioms which has been more 
successful. Recall that in Section 3, we formulated a vague principle on the 
existence of stages and derived three precise principles from it. If we could 
derive further precise principles, we could hope to obtain new axioms. 

The vague principle asserts the existence of stages which are after many 
other stages. Thus in view of the identifications of Section 5, we can expect 
the new axioms to state that there are ordinals which are very large. Since 
these ordinals generally turn out to be cardinals, the axioms are called large 
cardinal axioms. Much work has been done with such axioms; we are only 
going to indicate the direction this work has taken. See also Section 7 of 
Chapter B.3 and its anonymous Appendix. 

Let S be the collection of all stages which can be obtained by the three 
precise principles of Section 3. Clearly the weakest new precise principle 
would be: there is a stage which is after every stage in S. To justify this by 
our vague principle, we must be able to imagine a situation in which all the 
stages in S are completed, i.e., in which the three principles of Section 3
lead to no new stages. Without some sort of further analysis, it is not clear
whether we can do this. (We thus begin to see the weakness of the vague
principle.) However, let us suppose that our imagination is strong enough
for the task, and see what new axiom results.

Our axiom should state that there is an $\alpha$ so large that if the three
principles of Section 3 are applied to ordinals before $\alpha$ and sets formed
before $\alpha$, the resulting ordinals are less than $\alpha$. For the third principle, this
means that if $x \in R(\alpha)$ and f is a mapping of x into $\alpha$, then
$\mathrm{Un}(\mathrm{range}(f)) < \alpha$ (see Theorem 5.5). Now if we assume $\omega < \alpha$, then $\omega \in R(\alpha)$; so this
implies that if y is a countable set of ordinals less than $\alpha$, then $\mathrm{Un}(y) < \alpha$.
This implies that the first two principles of Section 3 lead only to ordinals
less than $\alpha$.

We are thus led to the following definition: $\alpha$ is inaccessible if $\omega < \alpha$ and
if whenever f is a mapping of a set in $R(\alpha)$ into $\alpha$, then $\mathrm{Un}(\mathrm{range}(f)) < \alpha$.
It is easy to show that every inaccessible ordinal is a cardinal, and that our
definition of an inaccessible cardinal is equivalent to the one usually given.

Our first large cardinal axiom states that there is an inaccessible cardinal. 
The first thing to show is that it really is a new axiom, i.e., that it is not 
provable in ZFC. We sketch how this is done. 

What we need is a model of ZFC in which the new axiom is false. Now if 
there is no inaccessible cardinal, the class of all sets furnishes such a model. 
Now suppose that $\alpha$ is the least inaccessible cardinal. Since $\alpha$ and $R(\alpha)$
satisfy the three principles of Section 3, we suspect that $R(\alpha)$ is a model of
ZFC. This is indeed the case; the proof rather resembles our derivation of
the axioms of ZFC from the principles of Section 3. The fact that the new
axiom does not hold in $R(\alpha)$ follows (with some work) from the fact that
there is no inaccessible cardinal in $R(\alpha)$.

Does the new axiom have any interesting consequences? There is at least
one. A famous theorem of Gödel says that the consistency of ZFC cannot
be proved in ZFC. This consistency can be proved from the new axiom.
The crucial fact has already been stated: if $\alpha$ is inaccessible, then $R(\alpha)$ is a
model of ZFC. Once we have a set which is a model of ZFC, it is easy to
prove that ZFC is consistent.

On the other hand, our axiom contributes nothing to the unsolved 
problems mentioned at the beginning of this section; for the independence 
proofs remain correct when we introduce the new axiom. Thus we must 
look for stronger large cardinal axioms. 

We shall consider one such axiom, which says that a measurable cardinal 
exists. We shall not give the precise definition of a measurable cardinal. 
Suffice it to say that nothing about the definition suggests that a measurable 
cardinal must be large. However, we can prove that measurable cardinals 
are inaccessible, and that they are, in certain senses, much larger than 
inaccessible cardinals. For example, if $\alpha$ is a measurable cardinal, then
there are $\alpha$ inaccessible cardinals less than $\alpha$.

What we would really like to do (but are presently unable to do) is to
reformulate the definition of a measurable cardinal to look like this: $\alpha$ is
measurable iff $\alpha$ and $R(\alpha)$ are closed under certain operations. We could
then try to justify the existence of a measurable cardinal by imagining a
situation in which these operations produce no new ordinals.

We thus see that there is much less reason to believe in the existence of a
measurable cardinal than in the existence of an inaccessible cardinal. On 
the other hand, the former assumption solves various interesting problems 
in set theory. We mention one result which is likely to interest mathemati- 
cians. If there is a measurable cardinal, then every set of real numbers 
which is the continuous image of the complement of a continuous image of 
a Borel set is measurable. It is certainly surprising that the existence of a 
large cardinal implies the measurability of a set of real numbers. 

We mention here one further axiom (not a large cardinal axiom): the 
Axiom of Projective Determinacy. This axiom solves a great many more 
problems than the existence of a measurable cardinal. On the other hand, 
there is no reason for believing that this axiom is true, except that it is an 
elegant axiom with interesting consequences. 

Thus we see that the more problems a new axiom settles, the less reason 
we have for believing the axiom is true. Moreover, we have no good axioms 
at all which settle the most important unsolved problem, the Continuum 
Hypothesis. We are therefore very far from the goal of solving our 
problems by means of new axioms. Nevertheless, there is no reason to be 
discouraged. If the rather elementary analysis of Section 3 leads us as far as 
it did, there is reason to hope that a deeper analysis will lead to new axioms 
with profound consequences.


1. Introduction


I guess I may assume that the reader has heard about the axiom of choice 
and possibly has even seen a proof of some theorem which makes use of 
the axiom of choice. 

The statement of the axiom of choice is very simple and easy to
understand.


1.1. Axiom (see Fig. 1). Let $\mathcal{F} = \{A_i : i \in I\}$ be a collection of pairwise
disjoint nonempty sets. There exists a set $C = \{x_i : i \in I\}$ which has exactly
one element $x_i$ in common with each $A_i \in \mathcal{F}$.


[Fig. 1]


Now why has this simple (if not self-evident) axiom generated so much 
controversy? For no postulate since Euclid’s Parallel Axiom aroused so 
much excitement in mathematical circles and led to so many philosophical 
arguments about the foundations of mathematics. 

The answer lies in the nonconstructive nature of the axiom of choice. 
The axiom postulates the existence of a set $C$ which has certain properties 
(namely, it chooses one element $x_i$ in each $A_i$), but does not give the slightest 
hint how to construct such a set. On the other hand, all other axioms of set 
theory assert that certain constructions on sets result in new sets, and that 
various totalities of elements, defined in a certain way, are indeed sets. For 
instance, the power set axiom states that for every set X, the collection 
P(X) of all subsets of X is a set. 

In case the reader does not consider this distinction too important, I 
would like to remind him that traditionally, until the late nineteenth 
century, existence in mathematics was synonymous with construction. 
Cantor’s alternate proof of existence of transcendental numbers, or 
Hilbert’s solution of Gordan’s problem met with a skeptical reaction and 
even animosity of leading mathematicians of that time. (The tendency to 
emphasize constructions was even stronger in the mathematical public: a 
Professor Hermes devoted ten years of his life to the construction of a 
regular polygon of 65,537 sides, which had been proved to be possible by 
Gauss a century earlier.) 

The explicit formulation of the axiom of choice is usually ascribed to 
Zermelo. According to FRAENKEL, BAR-HILLEL and Lévy [1973] (which 
gives an excellent account of not only the axiom of choice but of the whole 
axiomatic foundations of set theory) the first explicit allusion to the 
principle of choice was made by Peano in an 1890 paper on differential 
equations, although Cantor had inadvertently applied the axiom before. 

In 1904, Zermelo gave a proof that every set can be well-ordered (a 
linear ordering < of a set S is a well-ordering if every nonempty subset X 
of S has a least element). Earlier, when Cantor invented the theory of 
cardinal numbers, he posed the problem to determine the size of the 
continuum (the continuum hypothesis) and made the assumption that the 
set of all real numbers (the continuum) can be well-ordered, or, what 
amounts to the same, that it can be arranged into a transfinite sequence. 
Needless to say, since no such well-ordering had been constructed, this 
assumption met with strong opposition. 

To prove that every set can be well-ordered, Zermelo formulated the 
axiom of choice, more or less in the same form as it is used today. In our 
terminology the principle is thus stated as follows: 


1.2. AXIOM OF CHOICE. For every family $\mathcal{F}$ of nonempty sets, there exists a 
function $f$ such that $f(S) \in S$ for each set $S$ in the family $\mathcal{F}$. 


The function $f$ is called a choice function on $\mathcal{F}$. Before taking a look at 
Zermelo’s proof, let me show that the formulation 1.2 is equivalent to the 
first formulation 1.1. In fact, 1.1 is a special case of 1.2: if the family $\mathcal{F}$ 
consists of pairwise disjoint sets then 1.1 and 1.2 clearly say the same thing. 
Thus let us show that if we assume the choice principle for disjoint 
collections of sets, we can prove the general form 1.2. 

Let $\mathcal{F}$ be a family of nonempty sets: $\mathcal{F} = \{X : X \in \mathcal{F}\}$. To apply 1.1, we 
employ the following trick to “make the sets in $\mathcal{F}$ disjoint”: For each 
$X \in \mathcal{F}$, let $S_X$ be the set of all ordered pairs $(X, a)$, where $a \in X$: 

$$S_X = \{X\} \times X.$$ 

Now, the collection $\{S_X : X \in \mathcal{F}\}$ consists of pairwise disjoint nonempty 
sets, and using 1.1, we can pick one element $z_X$ from each $S_X$. Since for 
each $X \in \mathcal{F}$, $z_X$ has the form $(X, a_X)$ where $a_X \in X$, it is easy to see that we 
obtain a choice function $f$ on $\mathcal{F}$: we let $f(X) = a_X$ for each $X \in \mathcal{F}$. 
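
The trick of tagging each element with the set it came from can be tried out on a small finite family. The following Python sketch is only an illustration and not part of the original argument: the names `tag` and `choice_from_selector` are invented here, and the selector `C` that 1.1 would provide is simply written down by hand, since for a finite family no axiom is needed.

```python
def tag(family):
    """Send each set X to S_X = {X} x X, the set of pairs (X, a) with a in X."""
    return {X: frozenset((X, a) for a in X) for X in family}


def choice_from_selector(family, selector):
    """Turn a set meeting each S_X in exactly one pair (X, a_X) into f(X) = a_X."""
    f = {}
    for X, S_X in tag(family).items():
        (pair,) = S_X & selector  # exactly one common element, as in 1.1
        f[X] = pair[1]
    return f


# Two overlapping sets: 1.1 does not apply to them directly.
family = [frozenset({1, 2}), frozenset({2, 3})]
tagged = tag(family)

# A hand-written selector C standing in for the set given by 1.1.
C = frozenset({(family[0], 2), (family[1], 2)})
f = choice_from_selector(family, C)
```

Although the two original sets share the element 2, their tagged copies are disjoint, because the pairs differ in the first coordinate.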

Now back to Zermelo. The proof of the well-ordering theorem goes 
roughly as follows: Let $S$ be an arbitrary set; we wish to find a well-ordering 
of $S$. Using the axiom of choice, we assume that there exists a 
choice function $f$ on the family $\mathcal{F}$ of all nonempty subsets of $S$. By 
transfinite induction one constructs a transfinite sequence $\langle a_\alpha : \alpha < \theta \rangle$ in $S$ 
as follows: having constructed the first $\alpha$ terms of the sequence, 
$a_0, a_1, \dots, a_\xi, \dots$ ($\xi < \alpha$), one looks whether there are any elements of $S$ 
left, that is, whether the set $X = S - \{a_\xi : \xi < \alpha\}$ is nonempty. If it is nonempty, 
one chooses $a_\alpha$ by means of the choice function $f$: $a_\alpha = f(X)$. This 
procedure is continued until for some ordinal number $\theta$, the set 
$S - \{a_\xi : \xi < \theta\}$ is empty; in other words, $S = \{a_\xi : \xi < \theta\}$. Such an enumeration of 
$S$ by ordinal numbers gives a well-ordering of $S$. 
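
For a finite set the transfinite recursion collapses to an ordinary loop, which may make the shape of the argument easier to see. The sketch below is a finite analogue only, under the assumption that a choice function is handed to us as a Python function; the names `well_order` and `longest_first` are invented for the illustration.

```python
def well_order(S, f):
    """Enumerate the finite set S by repeatedly applying the choice
    function f to the set of elements not yet listed."""
    remaining = set(S)
    sequence = []
    while remaining:                 # X = S - {elements listed so far} is nonempty
        a = f(frozenset(remaining))  # the next term is f(X)
        sequence.append(a)
        remaining.remove(a)
    return sequence


def longest_first(X):
    """A sample choice function on nonempty sets of words: pick the longest
    word, breaking ties alphabetically."""
    return max(sorted(X), key=len)


S = {"pear", "fig", "apple", "kiwi"}
order = well_order(S, longest_first)
```

A different choice function would give a different enumeration; the argument only needs that some choice function exists.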

Now you can see that although we proved Cantor’s assumption that the 
continuum can be well-ordered, we have not produced any well-ordering of 
the reals that one could lay hands on; we have just replaced one dubious 
assumption by another dubious assumption, namely by the axiom of 
choice. 

In fact, the axiom of choice and Zermelo’s well-ordering theorem are 
logically equivalent. Let me show how the axiom of choice follows from the 
assumption that every set can be well-ordered: Let $\mathcal{F}$ be a family of 
nonempty sets. We let $S$ be the union of the family $\mathcal{F}$: $S = \bigcup\{X : X \in \mathcal{F}\}$. 
Using a well-ordering $<$ of $S$, we define a choice function $f$ on $\mathcal{F}$ as 
follows: if $X \in \mathcal{F}$, we let $f(X)$ be the least element of $X$ in the ordering $<$. 
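
As a small illustration of this direction, consider sets of integers. The usual ordering of the integers is not a well-ordering, but the listing 0, -1, 1, -2, 2, ... is one. The Python sketch below (with the invented names `rank` and `choice_function`, and restricted to finite sets only because that is what the code can handle) takes the least element of each set with respect to that listing.

```python
def rank(z):
    """Position of the integer z in the listing 0, -1, 1, -2, 2, ..."""
    return 2 * z if z >= 0 else -2 * z - 1


def choice_function(family):
    """f(X) = the least element of X in the well-ordering given by rank."""
    return {X: min(X, key=rank) for X in family}


family = [frozenset({-5, 3, 7}), frozenset({-1, -2}), frozenset({4})]
f = choice_function(family)
```

The rule "take the least element" is a single uniform definition, so no further choices are needed once the well-ordering is fixed.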


2. Do we need the axiom of choice? 


That of course depends on what kind of mathematics we are engaged in. 
If you are solving differential equations, foliating manifolds or investigating 
groups of large but finite order, you will probably never encounter a 
problem having anything to do with the axiom of choice. However, a large 
part of present day mathematics deals with abstract infinite structures and 
in many areas, mathematicians are more and more concerned with the 
foundations of mathematics. And the axiom of choice is indispensable not 
only in logic (set theory and model theory) but in other modern disciplines 
as well: point set topology, algebra, functional analysis, measure theory. 

To illustrate the use of the axiom of choice, let me consider some 
banality like the following: Everyone knows that the union of countably 
many countable sets is countable. This is used in real analysis all over and 
most people do not even realize that the proof uses the axiom of choice. 
After all, the proof goes like this: 

$$\begin{array}{lcccccc}
A_0 & a_{00} & a_{01} & a_{02} & \cdots & a_{0n} & \cdots \\
A_1 & a_{10} & a_{11} & a_{12} & \cdots & a_{1n} & \cdots \\
\vdots & & & & & & \\
A_m & a_{m0} & a_{m1} & a_{m2} & \cdots & a_{mn} & \cdots
\end{array} \tag{1}$$ 

We are given countably many sets $A_0, A_1, \dots, A_m, \dots$ and each $A_m$ is 
countable, thus $A_m = \{a_{mn}\}_{n=0}^{\infty}$. Hence we can arrange the elements of the 
union $A = \bigcup_{m=0}^{\infty} A_m$ into a countable sequence using the well known 
counting method: 

$$a_{00}, a_{01}, a_{11}, a_{10}, a_{02}, a_{12}, a_{22}, a_{21}, a_{20}, \dots$$ 
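
The counting method can be written as a short Python generator. This is a sketch under one important assumption: the enumerations are already given, in the form of a function `a(m, n)` returning the n-th element of the m-th set. The names `index_pairs` and `enumerate_union` and the sample sets are invented for the illustration.

```python
from itertools import count, islice


def index_pairs():
    """Yield (m, n) in the order 00, 01, 11, 10, 02, 12, 22, 21, 20, ..."""
    for k in count():
        for m in range(k + 1):          # down the k-th column: (0,k), ..., (k,k)
            yield (m, k)
        for n in range(k - 1, -1, -1):  # back along the k-th row: (k,k-1), ..., (k,0)
            yield (k, n)


def enumerate_union(a):
    """List the union of the sets A_m = {a(m, 0), a(m, 1), ...} without repetitions."""
    seen = set()
    for m, n in index_pairs():
        x = a(m, n)
        if x not in seen:
            seen.add(x)
            yield x


# Hypothetical example: A_m is the set of positive multiples of m + 1.
first = list(islice(enumerate_union(lambda m, n: (m + 1) * (n + 1)), 8))
```

Note that the code never has to decide how each set is enumerated; that information enters through the argument `a`.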


So where is the axiom of choice? 
Let me go over the proof once more. We are given a countable collection 
of sets $\mathcal{A}$; hence we can enumerate the elements of $\mathcal{A}$ by integers and have 

$$\mathcal{A} = \{A_0, A_1, \dots, A_m, \dots\}_{m=0}^{\infty}.$$ 

Each $A_m$ is a countable set and so for each $m$, there exists an enumeration 
of elements of $A_m$ by integers 

$$A_m = \{a_{m0}, a_{m1}, \dots, a_{mn}, \dots\}. \tag{2}$$ 

However, there exists more than one enumeration of the set $A_m$. If $E_m$ 
denotes, for each $m$, the set of all enumerations of $A_m$ (of the form (2)), we 
are confronted with the following problem: if we want to apply the 
diagram (1), we have to choose one specific enumeration of $A_m$, for each $m$. 
In other words, we have to choose one element from each set $E_m$. And here 
we are: we need a choice function on the family $\{E_0, E_1, \dots, E_m, \dots\}_{m=0}^{\infty}$. 

Naturally, the argument above shows that the axiom of choice is used in 
this particular proof of the theorem that the union of countably many 
countable sets is countable; it does not rule out the possibility of finding an 
alternate proof that would make no reference to the axiom of choice. 
However, it has been established (by methods which I shall discuss later in 
this article) that it is not so; one cannot prove without the axiom of choice 
that the union of countably many countable sets is countable; in fact, one 
cannot even prove that the set of all real numbers is not the union of 
countably many countable sets (!). 

The need for the axiom of choice becomes stronger as we move away 
from the continuum into the world of abstract structures and spaces. I shall 
give below examples of fundamental theorems of abstract algebra and 
topology whose proofs use the axiom of choice. In some instances, the 
theorems are as strong as the axiom of choice: an example of a statement 
equivalent to the axiom of choice is the Tychonoff product theorem in 
point set topology. 

If we want to use convenience as a criterion for accepting the axiom of 
choice, then its widespread use in many branches of mathematics in the last 
50 years speaks clearly in favor of the axiom. On the other hand, admitting 
that the nonconstructive character of the axiom makes it less self evident 
than other axioms, we have to ask about its formal consistency: does not 
the addition of the axiom to the other axioms of set theory lead to a 
contradiction? This has fortunately been settled by Gödel in 1939: the 
axiom of choice is consistent with axiomatic set theory. Thirdly, we 
should not be satisfied with formal consistency of the axiom. If we are to 
accept it, we should believe in its plausibility. We should make sure that the 
arguments that use the axiom of choice and the results that are obtained 
with its help are not contrary to our picture of the mathematical universe. (I 
would not insist too much on this point, though. After all, mathematics 
abounds in “counterintuitive” examples. Just look at Weierstrass’ 
construction of a continuous nowhere differentiable function.) Finally, again due to 
the nonconstructive character of the axiom it is interesting to find out 
whether the use of the axiom of choice in proofs of certain theorems is 
necessary and to what extent: this raises the questions of relative strength of 
various weaker forms and consequences of the axiom of choice. 

I will address myself first to the question of plausibility. At first 
glance, the axiom seems to be quite obvious: we are to pick one element 
$f(S)$ from each set $S$ in a given family $\mathcal{F}$. 

In popular expositions, the axiom of choice has often been compared to 
elections: we may look at each set $S \in \mathcal{F}$ as a list of candidates for a given 
office and the election process provides a choice function $f$ that determines 
the elected candidate $f(S)$, for every $S \in \mathcal{F}$. 

Of course, this analogy gives a justification of sorts for the finite case of 
the axiom, when both $\mathcal{F}$ and all $S \in \mathcal{F}$ are finite. And as we know, we 
cannot apply our intuition based on the finite world indiscriminately to 
infinite sets. Nevertheless, the axiom of choice is demonstrably true if the 
family $\mathcal{F}$ is finite, regardless of whether the sets $S \in \mathcal{F}$ are finite or not. 

This is proved by induction on the size of $\mathcal{F}$. If $\mathcal{F}$ consists of one nonempty 
set $S$, then any function $f$ whose only argument is $S$ and whose value at $S$ is 
(any) element $a$ of $S$ is a choice function on $\mathcal{F}$. And that such a function $f$ 
exists follows simply from the fact that $S$ is nonempty: the latter means that 
an $a \in S$ exists. 

Assuming that the axiom holds for all families of size $n$, we easily prove 
that it holds as well for families of size $n + 1$: Let $\mathcal{F} = \{S_1, \dots, S_n, S_{n+1}\}$ be a 
family of $n + 1$ nonempty sets. By the induction hypothesis, the family 
$\{S_1, \dots, S_n\}$ has a choice function $g$. Since $S_{n+1}$ is nonempty, there is some 
$a \in S_{n+1}$ and hence there is an extension $f$ of $g$, defined on $\mathcal{F}$, whose value 
at $S_{n+1}$ is $a$. Clearly, such a function $f$ is a choice function on $\mathcal{F}$. 

Another case in which the axiom of choice is demonstrably true is when 
each $S \in \mathcal{F}$ consists of a single element. Then $\mathcal{F}$ has a choice function and, 
in fact, this choice function is unique. (Thinking again about the analogy 
with elections I cannot help wondering whether this mathematical problem 
of existence of a choice function is not the reason why certain countries 
keep holding elections with exactly one candidate for each office.) 

While we have seen that every finite family $\mathcal{F}$ has a choice function, the 
finiteness of the elements of $\mathcal{F}$ does not generally guarantee that a choice 
function exists. In some cases, finiteness of the sets $S \in \mathcal{F}$ helps, especially 
if the sets $S$ are endowed with some internal structure. For instance, if $\mathcal{F}$ is 
a family of finite sets of real numbers, then a choice function on $\mathcal{F}$ exists: 
For each $S \in \mathcal{F}$, let $f(S)$ be the least element of $S$. On the other hand, if $\mathcal{F}$ 
consists of finite sets of sets of real numbers, then there is no apparent way 
of finding a choice function on $\mathcal{F}$ (and in fact, the existence of a choice 
function cannot be proved in this case). 

A classical illustration of this point (due to Russell) contrasts the case of 
an infinite set of pairs of shoes with that of an infinite set of pairs of socks. 
While the set of pairs of shoes has an obvious choice function (namely, 
choose the right shoe from each pair), there seems to be no way to 
choose (without recourse to the axiom of choice) between two socks, 
simultaneously for infinitely many pairs. 


3. The “paradoxical” decomposition of a ball 


Some objections to the axiom of choice were based on the fact that the 
axiom has paradoxical consequences. Using the axiom, one can obtain 
results that are in conflict with our intuition. The most famous example is 
the following paradox. 


3.1. BANACH-TARSKI PARADOX. Using the axiom of choice, one can cut a 
ball into a finite number of pieces that can be so rearranged that one obtains 
two balls of the same size as the original ball. 

I will sketch the proof of this “paradox”, to show how the axiom of 
choice is used, and to show that there is nothing paradoxical about this 
theorem: The pieces of the ball are just nonmeasurable sets and the 
construction is not much different from the well-known construction of a 
nonmeasurable subset of the real line. 

Of course the pieces in question cannot be measurable: the theorem 
would then be in contradiction with additivity of measure. Let me first 
recall Vitali’s construction of a nonmeasurable set of real numbers: Let us 
consider the following equivalence relation on real numbers in the interval 
$[0, 1]$: 

$$x \sim y \quad\text{iff}\quad x - y \text{ is a rational number.}$$ 

This equivalence relation gives us a decomposition of $[0, 1]$ into equivalence 
classes. Using the axiom of choice, we pick one element from each 
equivalence class and collect them into a set $M$. This set $M \subset [0, 1]$ cannot 
be measurable: For each rational number $r$, let $M_r = \{x + r : x \in M\}$. By 
the construction of $M$, the sets $M_r$ are mutually disjoint and every real 
number belongs to some (unique) $M_r$. If $M$ were measurable, each $M_r$ 
would have the same measure as $M$. If $M$ were a null set, then all $M_r$ would 
be too, and the real line would be the union of countably many null sets, 
which it is not. On the other hand, if $M$ has a positive measure, then the 
union $\bigcup\{M_r : r \text{ is rational and } 0 < r < 1\}$ of infinitely many disjoint sets of 
the same positive measure has to have infinite measure; this is a contradiction 
since the union is included in the interval $[0, 2]$. 

The Banach-Tarski paradox is based on an earlier theorem of Hausdorff 
that gives a paradoxical decomposition of a sphere: 


3.2. THEOREM (HAUSDORFF [1914]). A sphere $S$ can be decomposed into 
disjoint sets $S = A \cup B \cup C \cup Q$ such that: 
(i) the sets $A$, $B$, $C$ are congruent to each other, 
(ii) the set $B \cup C$ is congruent to each of the sets $A$, $B$, $C$; and 
(iii) $Q$ is countable. 


I will give a very short sketch of the proof. Some more details are in JECH 
[1973]; the complete proof can be found in Hausdorff’s book. 

Let us consider two axes of rotation $a_\varphi$, $a_\psi$ of the sphere, and consider 
the group of all rotations generated by a rotation $\varphi$ by 180° about $a_\varphi$ and a 
rotation $\psi$ by 120° about $a_\psi$. All such rotations can be described by formal 
products (“words”) formed by $\varphi$, $\psi$ and $\psi^2$, with the specification that 
$\varphi^2 = 1$ and $\psi^3 = 1$ (this is, if you like, the free product of the groups $\{1, \varphi\}$ 
and $\{1, \psi, \psi^2\}$). 

First I claim that the axes $a_\varphi$ and $a_\psi$ can be chosen in such a way that 
distinct “words” describe distinct rotations generated by $\varphi$ and $\psi$. To prove 
the claim, it suffices to determine the angle $\theta$ between $a_\varphi$ and $a_\psi$ so that no 
nontrivial word describes the identity rotation. If a word does describe the 
identity rotation, then $\theta$ is a solution of a certain equation. It so happens 
that the equation has only finitely many solutions and it follows that there 
are only countably many angles $\theta$ such that some nontrivial word describes 
the identity rotation. Hence any angle outside this countable set will do. 

The key step in the proof is the decomposition of the group $G$ of all words 
(or rotations generated by $\varphi$ and $\psi$) into three disjoint sets $\mathcal{A}$, $\mathcal{B}$, $\mathcal{C}$ 
such that 


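$$\mathcal{A} \cdot \varphi = \mathcal{B} \cup \mathcal{C}, \qquad \mathcal{A} \cdot \psi = \mathcal{B}, \qquad \mathcal{A} \cdot \psi^2 = \mathcal{C}.$$ 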


The construction is not difficult and the reader can probably find such 
decomposition himself if he cares to try. 
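
For the reader who would rather check a candidate than search for one, here is a Python sketch of one possible decomposition, defined by recursion on the length of a word. It is an illustration only and is not claimed to be the decomposition Hausdorff used. The letters `f`, `p`, `q` (for the generator of order 2, the generator of order 3, and its square) and the function names are conventions invented here. The code verifies, on all reduced words up to a given length, that a word lies in the first set exactly when its product with the order-2 generator lies in the second or third set, and exactly when its products with the order-3 generator and its square lie in the second and third set respectively.

```python
# Letters: "f" is the generator of order 2, "p" the generator of order 3,
# "q" its square (names chosen for this sketch).
PSI_EXP = {"p": 1, "q": 2}


def mul(word, letter):
    """Right-multiply a reduced word (a tuple of letters) by one letter,
    using f^2 = 1 and p^3 = 1 to keep the result reduced."""
    if not word:
        return (letter,)
    last = word[-1]
    if letter == "f":
        return word[:-1] if last == "f" else word + ("f",)
    if last == "f":
        return word + (letter,)
    exp = (PSI_EXP[last] + PSI_EXP[letter]) % 3
    return word[:-1] if exp == 0 else word[:-1] + ("pq"[exp - 1],)


def piece(word):
    """Return 0, 1 or 2 for the first, second or third set of the decomposition."""
    if not word:
        return 0                       # the empty word (identity) goes into set 0
    prev, last = piece(word[:-1]), word[-1]
    if last == "f":
        return 1 if prev == 0 else 0   # set 0 goes to set 1; sets 1 and 2 go to set 0
    return (prev + PSI_EXP[last]) % 3  # "p" shifts the sets cyclically 0 -> 1 -> 2 -> 0


def reduced_words(max_len):
    """All reduced words of length at most max_len."""
    words, frontier = [()], [()]
    for _ in range(max_len):
        frontier = [v for w in frontier for v in (mul(w, x) for x in "fpq")
                    if len(v) == len(w) + 1]
        words.extend(frontier)
    return words


def check(max_len):
    """Test the three identities on all reduced words up to the given length."""
    return all(
        (piece(w) == 0) == (piece(mul(w, "f")) in (1, 2))
        and (piece(w) == 0) == (piece(mul(w, "p")) == 1)
        and (piece(w) == 0) == (piece(mul(w, "q")) == 2)
        for w in reduced_words(max_len)
    )
```

Since right multiplication by a fixed letter is a bijection of the group onto itself, the three "exactly when" conditions are equivalent to the three set identities. A finite check of this kind is of course only evidence, not a proof; the proof is a short case analysis on the last letter of the word.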

Now we use the axiom of choice in much the same way as we used it in Vitali’s 
construction. Each rotation $\alpha \in G$ other than the identity leaves two points 
of the sphere $S$ fixed; since $G$ is countable, it follows that the set $Q$ of all points 
on the sphere that are fixed by some nonidentity rotation $\alpha \in G$ is countable. 
The set $S - Q$ is the disjoint union of the 
equivalence classes (“orbits”) given by the equivalence relation 


$$x \sim y \quad\text{iff}\quad y = x\alpha \text{ for some } \alpha \in G.$$ 


By the axiom of choice, there exists a set M which contains exactly one 
element in each orbit. If we let 


$$A = M \cdot \mathcal{A}, \qquad B = M \cdot \mathcal{B}, \qquad C = M \cdot \mathcal{C},$$ 


then the sets A, B, C and Q satisfy the statement of Hausdorff’s theorem. 

To obtain the Banach-Tarski paradox, let us consider the following 
equivalence relation between sets in the three-dimensional Euclidean 
space: $X \approx Y$ iff there is a finite decomposition of $X$ into disjoint sets 
$X = X_1 \cup \dots \cup X_m$, and a decomposition of $Y$ into the same number of 
disjoint sets $Y = Y_1 \cup \dots \cup Y_m$, such that $X_i$ is congruent to $Y_i$ for each 
$i = 1, \dots, m$. 

It is easy to verify that $\approx$ is an equivalence and that if $X$ is disjoint from 
$X'$, $Y$ is disjoint from $Y'$, $X \approx Y$ and $X' \approx Y'$, then $X \cup X' \approx Y \cup Y'$. A 
more important property is that if $X \subset Y \subset Z$ and $X \approx Z$, then $X \approx Y$. 
(The proof of this property is like the proof of the Cantor–Bernstein 
Theorem: If $X \subset Y \subset Z$ and $|X| = |Z|$, then $|X| = |Y|$.) 

Having Hausdorff’s decomposition of the sphere at our disposal, it is not 
too difficult to use the properties of the equivalence relation $\approx$ to prove: 


3.3. THEOREM (BANACH-TARSKI [1924]). A closed ball $U$ can be decomposed 
into two disjoint sets 

$$U = X \cup Y$$ 

such that $U \approx X$ and $U \approx Y$. 


4. Some uses of the axiom of choice 


In this section I shall discuss some typical applications of the axiom of 
choice in mathematics. I have already mentioned the Tychonoff product 
theorem stating that the topological product of any collection of compact 
spaces is compact. It is not surprising that the product theorem uses the 
axiom of choice; after all, the statement that the cartesian product of any 
collection of nonempty sets is nonempty is just another formulation of the 
axiom of choice. (The elements of a cartesian product are the choice 
functions.) 

In fact the relationship of the Tychonoff theorem and the axiom of 
choice is even closer: it has been shown by Kelley that the axiom of choice 
can be proved if one assumes that the Tychonoff theorem is true. Thus the two 
statements are logically equivalent (in set theory without the axiom of 
choice). 

A large number of proofs using the axiom of choice, particularly in 
algebra, follow a similar pattern. The theorem in question is reduced to a 
statement asserting existence of a maximal object in a certain class of 
objects. For instance, let us consider the theorem stating that every vector 
space has a basis. It is rather obvious that a set of vectors is a basis if and 
only if it is linearly independent and moreover maximal among linearly 
independent sets (i.e. there is no larger linearly independent set). Thus, to 
prove the theorem it suffices to show that there exists a maximal independent 
set. 

Another such example is the Hahn-Banach Theorem in functional 
analysis. One version of the theorem states that any linear functional on a 
subspace of a given vector space can be extended to a linear functional on 
the whole space. Again, if one considers the family $\mathcal{F}$ of all extensions of 
the given functional (whether defined everywhere or not), then those 
functionals that are defined everywhere in the space are exactly the 
maximal elements of the family $\mathcal{F}$. 

This phenomenon had been recognized and led to the formulation of a 
general principle, usually referred to as Zorn’s lemma (it was first proved 
by Kuratowski and rediscovered twenty years later by Zorn). 

Let $(P, <)$ be a partially ordered set. A subset $C$ of $P$ is called a chain in 
$P$ if it is linearly ordered by $<$. An element $u \in P$ is an upper bound of $C$ 
if $c \le u$ for every $c \in C$. An element $a \in P$ is a maximal element if there 
exists no $x \in P$ such that $a < x$. 


4.1. Zorn’s Lemma. Let (P, <) be a nonempty partially ordered set with the 
property that every chain in P has an upper bound. Then P has a maximal 
element. 


Let us see how Zorn’s lemma can be used in the above examples. In the 
first example, let P be the collection of all linearly independent subsets of 
the vector space, and let $X < Y$ just in case $X \subset Y$. A chain in $P$ is a 
collection $C$ of independent sets such that for any $X, Y \in C$, either $X \subset Y$ 
or $Y \subset X$. If $C$ is such a chain then the set $\bigcup\{X : X \in C\}$ is a linearly 
independent set and is an upper bound of C. Hence (P, <) satisfies the 
assumption of Zorn’s lemma and so has a maximal element B. It follows 
that B is a basis of the vector space. 

The proof of the Hahn-Banach Theorem is similar. We let P be the 
collection of all linear functionals extending the given functional, and let 
f <g just in case g extends f. It is easily verified that Zorn’s lemma is 
applicable to (P, <), and a maximal element of P is the required extension. 

Zorn’s lemma is easily proved when we assume the axiom of choice: Let 
(P, <) be a partially ordered set such that every chain has an upper bound. 
We construct an increasing transfinite sequence of elements of $P$: 
$a_0 < a_1 < \dots < a_\alpha < \dots$. As long as we do not reach a maximal element, we can find 
a yet bigger element (and can choose one), since every chain has an upper 
bound. Eventually, we do reach a maximal element. 

It is worth noting that vice versa, Zorn’s lemma implies the axiom of 
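choice. Let me show how to obtain a choice function on a family $\mathcal{F}$ of 
nonempty sets, using Zorn’s lemma: Let $P$ be the collection of all choice 
functions on subfamilies of $\mathcal{F}$, and let $f < g$ just in case $g$ extends $f$. 
Applying Zorn’s lemma, one gets a maximal element of $P$; it must be defined on 
all of $\mathcal{F}$, since otherwise it could be extended, and so it is a choice function on $\mathcal{F}$. 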

I will now discuss one consequence of the axiom of choice which has the 
remarkable property that there are a number of seemingly unrelated 
theorems that are all equivalent to it. 

I assume that the reader is familiar with the notion of Boolean algebra. 
A subset I of a Boolean algebra B is an ideal if 

(i) $0 \in I$, $1 \notin I$, 
(ii) $a \in I$ and $b \le a$ implies $b \in I$, 
(iii) if $a \in I$ and $b \in I$ then $a + b \in I$. 

An ideal $I$ in $B$ is a prime ideal if for every $a \in B$, either $a \in I$ or $-a \in I$. 


4.2. PRIME IDEAL THEOREM (Tarski). Every Boolean algebra has a prime 
ideal. 


The proof of the prime ideal theorem is a typical application of Zorn’s 
lemma. Once we verify that an ideal in B is prime if and only if it is 
maximal (among ideals in B), we simply apply Zorn’s lemma to the 
collection of all ideals in B. 

Note that the prime ideal theorem readily implies its stronger version: In 
a Boolean algebra, every ideal can be extended to a prime ideal. 

For if I is an ideal in B, we consider the quotient algebra B/I and once 
we get a prime ideal on B/I, we simply take the inverse image of this prime 
ideal under the natural homomorphism $h : B \to B/I$. 

In the particular case that B is the algebra of all subsets of a given set S, 
it follows that every ideal over S can be extended to a prime ideal over S. 
Equivalently, using the dual notions of filter and ultrafilter, it follows that 
every filter over a set S can be extended to an ultrafilter. (In fact, this 
statement is equivalent to the prime ideal theorem.) 

Once we have this formulation of the prime ideal theorem in terms of 
ultrafilters it is immediately clear how important the theorem is in point set 
topology. This is further witnessed by the following fact: The prime ideal 
theorem is equivalent to the Tychonoff theorem for products of compact 
Hausdorff spaces. 

The prime ideal theorem is also an important tool in logic. For instance, 
one of the basic principles in model theory is the Compactness Theorem: If 
every finite subset of a set of sentences $\Sigma$ has a model, then $\Sigma$ has a model. 
(See 2.4 and 4.2 in Chapter A.1.) It turns out that the Compactness 
Theorem is equivalent to the prime ideal theorem. 

Let me illustrate how the compactness theorem can be used in lieu of the 
axiom of choice in some proofs. Let us prove that every set can be linearly 
ordered. (Since the prime ideal theorem is weaker than the axiom of 
choice, we do not try to prove that every set can be well-ordered.) Let S be 
an arbitrary set. Let us consider a language that provides a name $x$ for each 
element $x$ of $S$, and has a binary predicate $<$. Let $\Sigma$ be the following set of 
sentences: 

$$\begin{array}{ll}
x \not< x, & \\
x < y \text{ and } y < z \supset x < z, & \text{for all } x, y, z \in S. \\
x < y \text{ or } y < x \text{ or } x = y, &
\end{array}$$ 

Since every finite subset of $S$ can be linearly ordered, it follows that every 
finite subset of $\Sigma$ has a model. By the compactness theorem, $\Sigma$ has a model 
and this model provides a linear ordering of $S$. 

For those readers that like point set-topological arguments, let me 
present a sample proof involving a topological version of the prime ideal 
theorem. The following principle appears quite often, in different situa- 
tions and under various guises (one version is called the Rado selection 
lemma): 


4.3. Lemma. Let $S$ be a set, let $E$ be a finite set, and let $\mathcal{F}$ be a collection of 
functions $t$ such that 
(i) $\mathrm{dom}(t)$ is a finite subset of $S$ and $\mathrm{ran}(t) \subset E$, 
(ii) if $t \in \mathcal{F}$ and $t' \subset t$, then $t' \in \mathcal{F}$, 
(iii) for every finite $X \subset S$ there is $t \in \mathcal{F}$ such that $X \subset \mathrm{dom}(t)$. 
Then there exists a function $f : S \to E$ such that for every finite $X \subset S$, the 
restriction $f | X$ belongs to $\mathcal{F}$. 


PROOF. To prove this lemma, consider the topological product $E^S$ 
(where $E$ has the discrete topology). By Tychonoff’s theorem for Hausdorff 
spaces, $E^S$ is compact. For every finite $X \subset S$, let 
$F_X = \{f \in E^S : f | X \in \mathcal{F}\}$. Each $F_X$ is closed, and the collection 
$\mathcal{C} = \{F_X : X \subset S \text{ finite}\}$ has the finite intersection property. Thus the intersection of $\mathcal{C}$ is 
nonempty and yields an element of $E^S$ with the required property. □ 


Since the Rado selection lemma can be used in the proof of the 
compactness theorem, it follows that it is equivalent to the prime ideal 
theorem. 

We note in passing that the Hahn-Banach Theorem is in fact a consequence 
of the prime ideal theorem; the full axiom of choice is not necessary. 
Another notable consequence of the prime ideal theorem is the 
Stone-Čech compactification theorem. 

When using the axiom of choice it is not always necessary to apply the 
most general version, existence of a choice function on any family of 
nonempty sets. In many proofs, particularly in analysis, it suffices to assume 
that every countable family of nonempty sets has a choice function. 
Strangely enough, this version of the axiom of choice was more acceptable 
to the critics of the axiom, although it is as nonconstructive as the general 
version and it is not clear to me why it should be more plausible. 

One consequence of the countable axiom of choice is of course that the 
union of countably many countable sets is countable (see the proof in 
Section 2). This fact is frequently used in real analysis and measure theory; 
for instance to prove the basic properties of Borel sets and of Lebesgue 
measure or to show that the ideal of meager sets is closed under countable 
unions. 

The countable axiom of choice is indispensable in descriptive set theory. 
(Without it, it could happen that the continuum is the union of countably 
many countable sets.) It is also sufficient for most arguments in descriptive 
set theory. Still, the favorite of the descriptive set theorists is the following 
stronger principle (the principle of dependent choices): 

If $\rho$ is a relation on a nonempty set $A$ such that for every $x \in A$ there 
exists $y \in A$ with $x \rho y$, then there is a sequence $\{x_n\}_{n=0}^{\infty}$ of elements of $A$ 
such that 

$$x_0 \rho x_1, \quad x_1 \rho x_2, \quad \dots, \quad x_n \rho x_{n+1}, \quad \dots$$ 


The principle of dependent choices implies the countable axiom of choice: 
Given countably many nonempty sets $S_1, S_2, S_3, \dots$, let $A$ consist of all 
choice functions on the first $n$ sets $S_1, \dots, S_n$ (for every $n$) and let $f \rho g$ mean that $g$ 
properly extends $f$. A sequence $f_1, f_2, f_3, \dots$ such that $f_1 \rho f_2$ etc. yields a choice 
function on $\{S_n\}_{n=1}^{\infty}$. Moreover, the principle of dependent choices has 
other useful consequences: for instance, if a linear ordering $<$ is not a 
well-ordering then there exists an infinite descending sequence 
$a_0 > a_1 > a_2 > \dots$ 


5. Consistency of the axiom of choice 


Unlike the question of plausibility, the consistency of the axiom of 
choice is purely a formal problem of the axiomatic set theory. 

A system of axioms $\Sigma$ for a mathematical theory is consistent if there 
exists no proof of a contradiction based on axioms in $\Sigma$. As discussed in 
Chapter D.1, the consistency of a strong enough theory (such as set theory) 
cannot be established by methods formalizable in that theory. In other 
words, there can be no formal proof of consistency of axiomatic arithmetic, 
set theory and similar theories. 

However, one can still ask whether a certain axiom $A$ is consistent 
relative to an axiomatic system $\Sigma$; namely whether, assuming that $\Sigma$ is 
consistent, it remains consistent upon adding $A$ to it. Another way to say 
this is that the negation of $A$ is not provable from the axioms of $\Sigma$. In the 
case of the axiom of choice, the question is to show that it is consistent 
relative to other axioms of set theory, that is, that it cannot be refuted by a 
proof using the other axioms (unless these axioms are themselves inconsistent). 

Consistency of the axiom of choice was proved by Gödel in 1939 (along 
with consistency of the continuum hypothesis). I will outline the main idea 
of Gödel’s proof, but first I would like to say a few words about consistency 
proofs in general. 

The reader is probably familiar with the problem of Euclid’s parallel 
postulate and with the various models of noneuclidean geometries. These 
models establish unprovability of the parallel postulate in geometry by 
satisfying all other geometrical axioms except the parallel postulate. In 
general, in order to show that an axiom is consistent with a theory $\Sigma$ (that 
its negation is not provable), one constructs a model of $\Sigma$ in which the 
axiom is satisfied. 

I shall outline two methods of getting a model of set theory that satisfies 
the axiom of choice. Both methods are due to Gödel. 

The first model consists of constructible sets. The idea behind constructible 
sets is that since the axioms of set theory postulate that various 
constructions can be performed, there must be a minimal collection of sets 
closed under all possible set-theoretical constructions. Thus, one constructs 
the constructible model L (the universe of constructible sets) by transfinite 
induction, starting with the empty set and closing off under set-theoretical 
operations. (For some more details see Chapter B.5 on constructible sets.) 
The constructible model satisfies all the axioms of set theory, and also the 
axiom of choice. The reason why the axiom of choice is true in the model is 
that one can arrange all the sets in the model into a transfinite sequence: a 
constructible set X precedes a constructible set Y if it is constructed before 
Y. In other words, in the model L we have a well-ordering of the universe, 
and so the axiom of choice holds in L. 

The other construction uses definable sets. The model consists of all sets 
that are hereditarily ordinal-definable (HOD). That means sets definable 
by a formula containing ordinal numbers as parameters, and such that all 
their elements are so definable, and all elements of their elements etc. The 
model HOD is closed under set-theoretical operations, and for that reason 
(more or less) satisfies all axioms of set theory. Again, we have a 
well-ordering of the universe in HOD: we may enumerate all possible ways 
of defining a set, and use this enumeration and the natural well-ordering of 
the ordinal parameters to well-order the HOD sets. 


6. Independence of the axiom of choice 


In 1963, Cohen constructed a model of set theory in which the axiom of 
choice is false, thus showing that it cannot be proved from the other axioms 
of set theory. Cohen introduced a powerful new method for constructing 
models, the method of forcing, and used it to prove the independence of 
the continuum hypothesis and of the axiom of choice. 

The method of forcing is explained in Chapter B.4, so let me only recall 
the main ideas of the method and some basic facts about generic models. 
One starts with a given model of set theory, called the ground model, and 
tries to extend this model to a larger model that has the same ordinal 
numbers but has new sets that do not belong to the ground model. Since 
the work is done inside the ground model, the new sets to be adjoined to 
the ground model are only hypothetical, or imaginary, and cannot be 
described completely in the ground model. Instead, one singles out some 
conditions that one forces upon the new sets. The key concept is the notion 
of forcing, the relation “p forces $\sigma$”, where p is a forcing condition 
and $\sigma$ is a sentence involving names of the new sets. Given a sentence 
$\sigma$, some conditions may force it true and some may force it false, but 
there is always a forcing condition p which decides $\sigma$, that is, either 
p forces $\sigma$ or p forces $\neg\sigma$. 

To construct a generic model, one postulates existence of a generic set of 
conditions G, a set of forcing conditions that is consistent and for every 
sentence $\sigma$ contains a condition that decides $\sigma$. A generic set is 
generally not in the ground model. Using G, every sentence of the forcing 
language is 
declared either true or false, and every name is assigned a definite, 
completely determined set (possibly in the ground model, possibly a new 
set). The collection of all sets so obtained includes the ground model, 
contains G and is closed under all set-theoretical constructions. It is a 
model of set theory and is called a generic extension of the ground model. 
The important property of a generic extension is that it can be described 
inside the ground model: the main theorem states that a sentence is true in 
the extension iff it is forced by some condition in G. 

If the ground model M satisfies the axiom of choice then the generic 
extension M[G] also satisfies the axiom of choice. Thus if we want to 
obtain a model in which the axiom of choice fails we have to do an 
additional construction. The idea is that the new sets in M[G] are very 
much alike, so that there is no definable well-ordering in M[G]. Thus 
we take an infinite collection $A \in M[G]$ of, say, new sets of integers, 
$A = \{a : a \in A\}$, and consider a model $N = M(\{a : a \in A\})$ obtained 
by adjoining the sets $a \in A$ to the ground model. We have 
$M \subseteq N \subseteq M[G]$, and since we have not adjoined a well-ordering 
of A, we expect that there is no well-ordering of A in N. 

The construction of N, the symmetric extension of M, uses an idea that 
goes back to Fraenkel, who already in the 1920’s suggested a method to 
show that the axiom of choice is unprovable. His ideas were worked out by 
Mostowski in the 1930’s, who introduced a construction of models known as 
Fraenkel-Mostowski models, or permutation models. 

While the Fraenkel-Mostowski method gives models in which the axiom 
of choice is false, it does not solve the independence problem in set theory, 
since the underlying universe is not the universe of set theory. The 
Fraenkel-Mostowski universe (see Fig. 2) differs from that of ordinary set 
theory in assuming the existence of atoms (or urelements), objects that have 
no elements. The Fraenkel-Mostowski universe consists of all sets built up 
from the atoms, while the true set-theoretical universe is built up from the 
empty set. (See the discussion in Chapter B.1.) 


[Fig. 2. The Fraenkel-Mostowski universe. Labels in the figure: true 
universe, ordinal numbers, atoms, the empty set.] 


As shown in Fig. 2, the true universe is a part of the Fraenkel-Mostowski 
universe; I shall call it the kernel. 

The useful property of the atoms is that they are all alike. This fact is 
instrumental in the Fraenkel-Mostowski method. I will now describe the 
simplest example of a permutation model. Let A, the set of all atoms, be 
infinite. Every permutation $\pi$ of A can be extended to an automorphism 
of the FM universe: when X is a set and $\pi(x)$ has been defined for all 
$x \in X$, we let $\pi(X) = \{\pi(x) : x \in X\}$. Only the sets outside the 
kernel are moved by $\pi$, since $\pi(\emptyset) = \emptyset$ and so 
$\pi(X) = X$ for every X in the kernel. 
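
The recursive extension of a permutation of atoms to sets can be tried out on hereditarily finite sets. The following Python sketch is only an illustration under assumptions that are not in the text: atoms are modelled as strings, sets as frozensets, and the helper name `apply_perm` is hypothetical.

```python
def apply_perm(perm, x):
    """Extend a permutation of atoms to sets built up from atoms.

    perm: dict mapping atoms (strings) to their images (assumed encoding).
    x: an atom, or a frozenset whose members are atoms or frozensets.
    """
    if isinstance(x, frozenset):
        # pi(X) = {pi(x) : x in X}, by recursion on membership
        return frozenset(apply_perm(perm, y) for y in x)
    return perm.get(x, x)  # atoms not mentioned in perm stay fixed


# A transposition of the two atoms "a" and "b".
swap = {"a": "b", "b": "a"}

empty = frozenset()
pure = frozenset({empty, frozenset({empty})})  # a set in the kernel
mixed = frozenset({"a", frozenset({"c"})})     # a set that mentions atom a

print(apply_perm(swap, pure) == pure)    # True: kernel sets are not moved
print(apply_perm(swap, mixed) == mixed)  # False: this set is moved
```

In this toy setting the pair `frozenset({"a", "b"})` is also fixed by `swap`, although it lies outside the kernel; being fixed by one particular permutation is weaker than the symmetry condition defined next.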

Let us call a set X symmetric if there exists a finite set of atoms 
$\{a_1, \ldots, a_n\}$ (a support of X) such that whenever $\pi$ is a 
permutation of A such that $\pi a_1 = a_1, \ldots, \pi a_n = a_n$, then 
$\pi(X) = X$. Let $\mathcal{U}$ be the class of all hereditarily symmetric 
sets. The class $\mathcal{U}$ is closed under all operations and it follows 
that $\mathcal{U}$ is a model of set theory with atoms. Clearly, all sets in the 
kernel are symmetric and so $\mathcal{U}$ includes the kernel; $\mathcal{U}$ 
also includes all atoms since every atom is its own support. It turns out 
that if a set S of atoms is in $\mathcal{U}$, then either S is finite or 
$A - S$ is finite: it is easy to see that an infinite set of atoms with an 
infinite complement has no support! (Given any finite set E of atoms, some 
permutation fixes E pointwise and exchanges an atom of S outside E with 
an atom of $A - S$ outside E, and so moves S.) 

Now it is clear that the axiom of choice fails in $\mathcal{U}$. The set A 
cannot be well-ordered in $\mathcal{U}$, since otherwise it could be divided 
into two infinite sets belonging to $\mathcal{U}$. 

The Fraenkel-Mostowski method is very simple and has been used to 
obtain numerous independence results. However, these results do not give 
any information about the true sets, since the kernel is not affected by the 
construction of a permutation model. In the example I just gave, there is a 
set that cannot be well-ordered, but it is a set of atoms, not a set of real 
numbers or some other genuine mathematical sets. 

The ideas of permutation models can be combined with the forcing 
method and one can thus construct models of true set theory in which the 
axiom of choice fails. As I mentioned earlier, one constructs a symmetric 
model N such that $M \subseteq N \subseteq M[G]$. The idea of permutations is employed 
as follows: We consider permutations of forcing conditions (or automor- 
phisms of the corresponding Boolean algebra) and use these permutations 
to construct permutations of names (or automorphisms of the Boolean- 
valued model). Then we extend the ground model M by adding only those 
new sets that have a symmetric name (and so do their elements, and the 
elements of their elements etc.). In this way we obtain a collection 
$N \subseteq M[G]$ which includes M because no permutation moves the canonical 
names for sets in the ground model. N is a model of set theory, a symmetric 
extension of M. 

Arguments very similar to those I gave above for a permutation model 
can be used in this method and one can obtain models without the axiom of 
choice. For example, the simplest case of a symmetric model is obtained 
when we use an infinite set A of names of generic sets of integers and use 
permutations of A and finite supports in a similar way as we did in the FM 
case. In the resulting model $N = M(\{a : a \in A\})$, the set A is an infinite 
set of sets of integers and cannot be well-ordered. In fact, no infinite subset 
of A can be well-ordered and so A is an example of a Dedekind set, an 
infinite set that has no countably infinite subset. Since sets of integers can 
be identified with real numbers, we have an example of a Dedekind set of 
real numbers. 

The similarity between permutation models and symmetric models can 
be exploited in a very general fashion and in the next section I shall discuss 
how the results in set theory with atoms can be uniformly transformed into 
independence results in set theory. 


7. Transfer theorems 


As I mentioned in Section 6, the method of symmetric models of set 
theory uses ideas developed earlier for Fraenkel-Mostowski models. For 
example, the model $M(\{a : a \in A\})$ used by Cohen to prove independence 
of the axiom of choice is analogous to the simplest permutation model 
discussed above. Instead of atoms, Cohen’s model uses mutually generic 
sets of integers. However, the analogy between the permutation model and 
Cohen’s model is not complete. We have seen that in the permutation 
model, the set A cannot be divided into two disjoint infinite sets. On the 
other hand, in Cohen’s model, A is a set of real numbers, and every infinite 
set of real numbers can be divided into two disjoint infinite sets (in fact 
every infinite linearly ordered set can). The difference is due to the fact that 
in Cohen’s model, A carries a structure that prevents its elements from 
being completely alike, as is the case with the atoms. 

In another Fraenkel-Mostowski model, the set A is the union of 
countably many pairs $\{a_n, b_n\}$ of atoms, and has the property that the 
countable set of pairs $\{\{a_n, b_n\} : n \in \mathbb{N}\}$ has no choice 
function. (Here the atoms are like the socks in Russell’s example.) Since 
every family of finite sets of real numbers has a choice function, one cannot 
hope to construct a symmetric model analogous to this FM model, with atoms 
being replaced by real numbers. Still, another model of Cohen gives exactly 
such a result, but the sets $a_n$, $b_n$ are sets of real numbers. In other 
words, one gets a better analogy between permutation models and symmetric 
extensions if the place of atoms is taken by more abstract, less 
distinguishable sets. It turns out that if we let sets of sets of ordinals 
play the role of atoms, we get a quite satisfactory analogy between 
permutation models and symmetric models. The following theorem shows that 
any permutation model can be embedded in a symmetric model of set theory 
“with a prescribed degree of accuracy”. 

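If S is a set and $\alpha$ an ordinal number, let $P^\alpha(S)$ denote the 
$\alpha$-th iteration of the power set operation: 


\[
\begin{aligned}
P^1(S) &= P(S) = \{X : X \subseteq S\}, \\
P^{\alpha+1}(S) &= P^\alpha(S) \cup P(P^\alpha(S)), \\
P^\alpha(S) &= \bigcup_{\beta < \alpha} P^\beta(S) \quad \text{(if $\alpha$ is a limit ordinal).}
\end{aligned}
\]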
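
For a finite set S and finite exponents, the recursion above can be computed directly. The following Python sketch is only an illustration; the function names `power_set` and `iterated_power_set` and the encoding of sets as frozensets are assumptions made for this example, and limit stages are not covered.

```python
from itertools import chain, combinations


def power_set(s):
    """Return P(s), the set of all subsets of the finite set s."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return frozenset(frozenset(c) for c in subsets)


def iterated_power_set(s, n):
    """Return P^n(s) for a finite n >= 1, following the successor step."""
    if n < 1:
        raise ValueError("this sketch only covers n >= 1")
    level = power_set(s)                   # P^1(S) = P(S)
    for _ in range(n - 1):
        level = level | power_set(level)   # P^(k+1)(S) = P^k(S) union P(P^k(S))
    return level


atoms = frozenset({"a"})
print(len(iterated_power_set(atoms, 1)))  # 2
print(len(iterated_power_set(atoms, 2)))  # 5
```

The levels grow very quickly (the next level over two atoms already has more than half a million members), so the sketch is only practical for very small inputs.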


7.1. EMBEDDING THEOREM (Jech-Sochor, see Fig. 3). Let $\mathcal{U}$ be a 
permutation model, let A be its set of atoms, and let $\alpha$ be an ordinal 
number in $\mathcal{U}$. There exists a symmetric model N of set theory and 
an embedding $e : \mathcal{U} \to N$ such that $P^\alpha(A)$ in $\mathcal{U}$ 
is $\in$-isomorphic to $P^\alpha(e(A))$ in N. 


Fig. 3. 


The model N is a symmetric extension of the kernel M of the model 
$\mathcal{U}$, and the set e(A) consists of sets of sets of ordinal numbers 
of N. 

The import of the embedding theorem is that many independence results 
obtained in set theory with atoms by the Fraenkel-Mostowski method can 
be automatically transferred into ordinary set theory. There is a large class 
of statements (which I will not describe here) to which this transfer method 
is applicable; they are all existential statements of a certain kind. 

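A typical example is the statement ‘there exists a set that cannot be 
linearly ordered’. Let $\mathcal{U}$ be a permutation model which has a set 
that cannot be linearly ordered. For simplicity, let us assume that A itself 
cannot be linearly ordered. Then we construct a model N of set theory in 
which $\mathcal{U}$ can be embedded so that $P^\alpha(A)$ is isomorphic to 
$P^\alpha(e(A))$, for an $\alpha$ large enough that every binary relation on A 
belongs to $P^\alpha(A)$. It follows that e(A) cannot be linearly ordered in 
N, since any ordering of e(A), a binary relation on e(A), would have to be a 
member of $P^\alpha(e(A))$ and would therefore correspond to an ordering of A 
in $\mathcal{U}$. 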

There are numerous applications of this transfer method, and all 
statements that are being transferred are existential statements. However, 
this method does not apply to situations when one wants to prove 
independence of one statement from another. Consider this example: we 
want to show that the axiom of choice is stronger than the ordering 
theorem; we want a model in which the axiom of choice fails but still every 
set can be linearly ordered. (This is a result of Lévy.) 

We construct a permutation model (due to Mostowski) and wish to 
transfer the independence result from set theory with atoms into set 
theory. The negation of the axiom of choice transfers easily by the above 
method; the statement “every set can be ordered” does not. However, it is 
possible to investigate the structure of the permutation model and impose a 
similar structure on the corresponding symmetric model. In the case of the 
ordering theorem, it suffices to use the finite supports mentioned in Section 
6, and introduce a similar support structure in N. This in itself is enough 
for the statement “every set can be ordered” to be transferred. 

The transfer method has been perfected in recent years by D. Pincus, 
and although some of the transfer theorems seem to be ad hoc formula- 
tions of specific applications, they cover most independence results ob- 
tained by the Fraenkel-Mostowski method. 

Since the presence of atoms makes Fraenkel-Mostowski models different 
from models of set theory, it is to be expected that not every 
independence result can be transferred. I shall conclude this section with 
an example of a statement that is equivalent to the axiom of choice in 
ordinary set theory but is known to hold in a permutation model in which 
the axiom of choice fails. 

(The axiom of multiple choice.) Every family $\mathcal{F}$ of nonempty sets 
has a function f with the property that f(X) is a nonempty finite subset of 
X, for every $X \in \mathcal{F}$. 


8. Mathematics without the axiom of choice 


Since the axiom of choice is consistent with other axioms of set theory, 
there is no reason why it should not be admissible in mathematical proofs. 
However, since the axiom has a different character than the other axioms, 
it is useful to investigate models of set theory that do not satisfy the axiom 
of choice. The situation is analogous to noneuclidean geometries: by 
studying these models, one learns which theorems have to use the axiom 
and what are the implications among various consequences and weaker 
versions of the axiom. 

Most interesting among the large number of existing results are those 
that deal with real numbers. Particularly interesting is the result of Solovay 
who constructed a model of set theory in which every set of reals is 
Lebesgue measurable. In Solovay’s model, the principle of dependent 
choices holds, and so all the standard theorems of Lebesgue measure 
theory and descriptive set theory can be proved. The universe of Solovay’s 
model is thus very appealing to an analyst, since it lacks the unnatural 
counterexamples like Vitali’s nonmeasurable set, a set without the 
property of Baire, a discontinuous additive function, etc. 

As I stated earlier, the model constructed by Cohen contains a Dedekind 
set of reals, that is, an infinite set that has no countably infinite subset. 
Existence of such a set A provides several interesting examples. To start 
with, it is easy to show that the set A has a limit point, a. However, the 
point a cannot be a limit of any sequence in $A - \{a\}$, since A is Dedekind. 
This shows that the two usual definitions of a limit point (the one using 
neighborhoods and the one using sequences) are not equivalent without the 
axiom of choice. 

Continuity of real-valued functions is also defined in two ways: the 
$\varepsilon$-$\delta$ definition is one, and the other says that 
$\lim x_n = x$ implies $\lim f(x_n) = f(x)$. Using a Dedekind set of reals, 
one can construct a function which satisfies the limit definition of 
continuity, but is discontinuous nevertheless. 

Another interesting model, due to Feferman and Lévy, has the property 
that the set of all real numbers is the union of countably many countable 
sets. 

The permutation models, together with the transfer theorems, are also a 
source of interesting counterexamples. The following examples were 
constructed by Läuchli in permutation models, and transfer to set theory 
by the embedding theorem (except the algebraic closure, whose transfer is 
due to Pincus): 

(a) a vector space that has no basis; 

(b) a vector space that has two bases of different cardinalities; 

(c) a free group whose commutator subgroup is not a free group; 

(d) a field that has no algebraic closure. 

In Section 4, I have discussed at length the prime ideal theorem. Since it 
implies the ordering theorem, and the ordering theorem is not provable in 
set theory (without choice), it follows that the prime ideal theorem is 
unprovable. It follows from the axiom of choice, but it is not equivalent to 
it: Halpern and Lévy showed that the prime ideal theorem does not imply 
the axiom of choice. 

Concerning ultrafilters, Feferman constructed a model in which there is 
no nonprincipal ultrafilter on the set of all integers; Blass extended his 
result by showing that in a related model, there is no nonprincipal 
ultrafilter at all. 

The theory of cardinal numbers becomes quite interesting when the 
axiom of choice is dropped. To start with, without the axiom of choice, one 
cannot prove that any two cardinals are comparable; it can only be proved 
that the relation $|X| \leq |Y|$ is a partial ordering of cardinals. Not much 
more can be proved about this partial ordering, since by a result of this 
writer, given any partially ordered set, there exists a model of set theory 
which has a set of cardinals isomorphic to the given partially ordered set. 

There are a number of results concerning Dedekind cardinals (cardinals of 
Dedekind sets). For example, it is possible to have an infinite cardinal 
$\mathfrak{m}$ such that $2^{\mathfrak{m}}$ is a Dedekind cardinal. (While it is 
easy to show that $2^{2^{\mathfrak{m}}} \geq \aleph_0$ for any infinite 
$\mathfrak{m}$.) 

An old result of Tarski states that if $\mathfrak{m} \cdot \mathfrak{m} = \mathfrak{m}$ 
for every infinite $\mathfrak{m}$, then the axiom of choice holds. A recent 
construction of Sageev shows that this is not the case with the equation 
$\mathfrak{m} + \mathfrak{m} = \mathfrak{m}$. It holds in his model but the 
axiom of choice fails. 

If the axiom of choice holds, all successor alephs $\aleph_{\alpha+1}$ are 
regular cardinals. In the Feferman-Lévy model mentioned above, the cardinal 
$\aleph_1$ is singular: there exists an increasing sequence 
$\{\alpha_0, \alpha_1, \ldots, \alpha_n, \ldots\}$ whose limit is $\omega_1$. 
It is an open problem to construct a model of set theory in which every 
limit ordinal number is the limit of a countable sequence. (It is only 
known that a large cardinal assumption is necessary for this construction.) 

I shall end this section with an interesting combinatorial problem that 
has been a subject of a number of articles and whose solution involves, 
beside the set-theoretical methods discussed above, some number theory 
and some finite group theory. For any integer n, let us consider the 
following statement: 


    $C_n$: If $\mathcal{F}$ is a family of sets that have exactly n elements, 
    then $\mathcal{F}$ has a choice function. 


It was Tarski who inspired the investigation of these statements, and 
observed that $C_2$ implies $C_4$: 

Let A be a four-element set and let us assume that we have a function f 
that chooses from two-element sets. We shall use f to determine which 
element of A to choose. (We shall get a well-defined procedure, uniform 
for all four-element sets.) There are six two-element subsets of A. For each 
$x \in A$, let q(x) be the number of all pairs $\{x, y\} \subseteq A$ such that 
$f(\{x, y\}) = x$. Let q be the least such q(x) and let 
$B = \{x \in A : q(x) = q\}$. It is easy to see that $B \neq A$ (the four 
values q(x) add up to 6, so they cannot all be equal); thus B has 1, 2 or 3 
elements. If B has one element, we choose this element; if B has 3 elements, 
we choose the element that is not in B. If B has two elements, we choose 
f(B). 
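
The procedure just described is finite and can be written out as a short program. The following Python sketch is an illustration only; the name `choose_from_four` and the representation of f as a function on two-element frozensets are assumptions made for this example.

```python
from itertools import combinations


def choose_from_four(A, f):
    """Pick an element of the four-element set A using only a choice
    function f on two-element sets (f receives a frozenset of size 2)."""
    A = frozenset(A)
    if len(A) != 4:
        raise ValueError("A must have exactly four elements")
    # q[x] = number of pairs {x, y} inside A from which f picks x
    q = {x: 0 for x in A}
    for pair in combinations(A, 2):
        q[f(frozenset(pair))] += 1
    least = min(q.values())
    B = frozenset(x for x in A if q[x] == least)
    # The six values of f are spread over four elements, so B != A.
    if len(B) == 1:
        (x,) = B
        return x
    if len(B) == 3:
        (x,) = A - B
        return x
    return f(B)  # B has exactly two elements


# Example: f picks the smaller of two integers.
print(choose_from_four({3, 1, 4, 5}, min))  # 5
```

With f = min the counts are q(1) = 3, q(3) = 2, q(4) = 1 and q(5) = 0, so B = {5} and the element 5 is chosen. The result depends only on A and f, not on the order in which the pairs are visited.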

This nice observation raises a number of questions. The general problem 
is to determine which combinations of $C_n$’s imply other combinations of 
$C_n$’s. The various theorems in the literature give necessary and sufficient 
conditions (number or group theoretical) for an implication to be provable. 
One special case is the theorem that “$C_2$ implies $C_n$” is provable if and 
only if n = 1, 2 or 4. The only if part was proved by Mostowski for set theory 
with atoms and transfers to set theory by the methods of Jech-Sochor and 
Pincus. The reader might find it interesting that Mostowski’s proof makes use 
of number theory, in particular Bertrand’s postulate. 


9. Definable choice 


In this section, I will consider the question when one can define a choice 
function. In Section 5, I presented two models of set theory, the construct- 
ible universe L and the model HOD of hereditarily ordinal-definable sets. 
In both models, the universe has a definable well-ordering. (The model L is 
the least model that contains all ordinal numbers; on the other hand, HOD 
is the largest model that has a definable well-ordering.) 

Generic models provide numerous examples concerning definable choice. 
There exist both models which have nonconstructible sets and still have a 
definable well-ordering of the universe, and models which satisfy the axiom 
of choice but in which the choice functions are not definable. For instance, 
in Cohen’s model M[G] from Section 6, the set of all reals does not have a 
definable well-ordering. In a related model, every definable well-ordering 
of a set of reals is countable. 

The following question belongs rather to descriptive set theory: Let A 
be a set of real numbers and let $\mathcal{F} = \{S_x : x \in A\}$ be a family 
of nonempty sets of reals, such that each $S_x$ is defined from x in some 
uniform way (for instance each $S_x$ is a $\Pi^1_1$ set with code x). Can we 
define a choice function $\{a_x : x \in A\}$ on $\mathcal{F}$? 

Questions of this type arise frequently in descriptive set theory (see 
Chapter B.6). For instance, the uniformization theorem of 
Novikov-Kondo-Addison states that if P is a binary $\Pi^1_1$ relation on the 
reals then there exists a $\Pi^1_1$ function f which is a subset of P and has 
the same projection (see Fig. 4). 


[Fig. 4. Labels in the figure: f, P.] 


The picture explains how the uniformization theorem relates to the above 
question. 

No theorem of that sort can be proved for $\Pi^1_2$ sets: the set of all 
nonconstructible reals is $\Pi^1_2$, and in the Cohen model mentioned before, 
no nonconstructible real is definable; this gives an example of a nonempty 
$\Pi^1_2$ set of reals without definable elements. 


10. Determinacy: an alternative to the axiom of choice 


Since the axiom of choice has several consequences that may not be 
considered quite desirable (like the existence of nonmeasurable sets), 
attempts have been made to formulate axioms that contradict the axiom of 
choice and have more desirable consequences. (Here again, we have the 
analogy with noneuclidean geometries.) The most interesting alternative to 
the axiom of choice is the axiom of determinacy. 

Every set A of infinite sequences of integers defines the following 
infinite game $G_A$ of two players: Player I plays an integer $n_0$, player II 
responds by playing an integer $n_1$, then player I plays $n_2$, player II 
plays $n_3$, etc. If the resulting sequence $\{n_0, n_1, n_2, \ldots\}$ is in 
A, player I wins and otherwise player II wins. The game $G_A$ is determined 
if either player I has a winning strategy or player II has a winning 
strategy. The axiom of determinacy states that for every such set A of 
sequences, $G_A$ is determined. 
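
The notion of a winning strategy can be explored in a finite analogue of the game. The following Python sketch is an illustration under assumptions that are not in the text: the game is cut off after finitely many moves, each move is 0 or 1, and the function names are hypothetical. For such finite games determinacy is a theorem (by backward induction); the axiom of determinacy concerns the infinite game, which this sketch does not capture.

```python
def player_one_wins(A, length, prefix=()):
    """Return True if player I can force the final sequence into A.

    A: set of 0/1 tuples of the given (even) length, player I's winning set.
    Players alternate moves, player I moving first.
    """
    if len(prefix) == length:
        return prefix in A
    options = (player_one_wins(A, length, prefix + (m,)) for m in (0, 1))
    if len(prefix) % 2 == 0:
        return any(options)   # player I needs one good move
    return all(options)       # player I must survive every reply of player II


def player_two_wins(A, length, prefix=()):
    """Return True if player II can force the final sequence out of A."""
    if len(prefix) == length:
        return prefix not in A
    options = (player_two_wins(A, length, prefix + (m,)) for m in (0, 1))
    if len(prefix) % 2 == 0:
        return all(options)   # player II must survive every move of player I
    return any(options)       # player II needs one good reply


# Player I wins iff the two moves differ; player II can simply copy player I.
A = {(0, 1), (1, 0)}
print(player_one_wins(A, 2), player_two_wins(A, 2))  # False True
```

In every finite instance exactly one of the two functions returns True, which is the finite counterpart of the statement that the game is determined.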

Using a well-ordering of the set of all sequences of integers, one can 
construct a game that is not determined. Thus the axiom of determinacy 
contradicts the axiom of choice. 

The axiom of determinacy is particularly appreciated by the descriptive 
set theorists. For one thing, it implies the countable axiom of choice, and so 
the basic theorems on real numbers are not affected by the absence of the 
axiom of choice. It also implies that every set of reals is Lebesgue 
measurable, has the property of Baire and is either countable or of 
cardinality $2^{\aleph_0}$. Moreover, the axiom of determinacy settles various 
problems on projective sets, like uniformization and reduction theorems. 

Apart from the desirable consequences that the axiom of determinacy 
has in descriptive set theory, there is not much to be said in favor of the 
axiom as an alternative to the axiom of choice. For instance, it implies that 
$\aleph_1$ and $\aleph_2$ are measurable cardinals, 
$\aleph_3, \aleph_4, \aleph_5, \ldots$ are singular, then $\aleph_{\omega+1}$ 
and $\aleph_{\omega+2}$ are measurable again. Still, the axiom of determinacy 
is extremely 
interesting. Since it implies various large cardinal properties, one has to 
start with large cardinals if one hopes to prove consistency of the axiom of 
determinacy. This seems to be a hard problem, and so far, we do not yet 
know whether the various consequences of determinacy are consistent. For 
instance, one of the consequences of determinacy is the following state- 
ment: 


    (*) Every subset of $\aleph_1$ either contains or is disjoint from a closed 
    unbounded set. 

This statement implies that $\aleph_1$ is a measurable cardinal. Although this 
is known to be consistent, the statement (*) appears to be much stronger 
(although this is at present just a speculation). It seems that to establish 
consistency of (*) or of similar statements would be the first step in 
attacking the problem of consistency of the axiom of determinacy, which is 
certainly the most interesting open problem that involves the axiom of 
choice. 


Introduction 


We consider this chapter to be a short text, rather than a survey — i.e., 
many of the stated theorems are actually proved. Our intent is to introduce 
the reader to the basic methods of combinatorial set theory. 

We assume familiarity with naive set theory, as in Halmos [1960] or 
Chapter B.1. A knowledge of logic is neither necessary, nor even desirable. 

Except for an occasional trivial remark, none of the results in this paper 
are due to the author. We have made no attempt to attach names to each 
theorem or to refer to original sources; references are intended only to 
indicate areas for further reading. Some concepts have become so inti- 
mately connected with a name (e.g., Laplace transform) that it would be 
impossible to mention them without their founder, but others of far greater 
importance have become part of the folklore of the subject. Those who feel 
that their name has been left out may take solace from the fact that we do 
not invoke Newton or Leibniz each time we use a derivative. 


1. Notation 


We review here some terminology. Greek letters are used for (von 
Neumann) ordinals. $\mathrm{cf}(\alpha)$ is the cofinality of $\alpha$, the 
least ordinal $\beta$ such that there is an order preserving mapping 
$f : \beta \to \alpha$ whose range is cofinal in $\alpha$. Cardinals are 
initial ordinals; $\omega_\alpha$ is the $\alpha$-th infinite cardinal. 
$\lambda^+$ is the least cardinal greater than $\lambda$. $|X|$ is the 
cardinality of the set X. A cardinal $\kappa$ is regular if 
$\mathrm{cf}(\kappa) = \kappa$; otherwise $\kappa$ is singular. 

$P(X)$ is the set of subsets of X and $X^Y$ is the set of functions from Y to 
X. $X^{<\alpha} = \bigcup\{X^\xi : \xi < \alpha\}$. $f \restriction X$ is the 
restriction of the function f to the set X. 

If $\kappa$ and $\lambda$ are cardinals, we perpetuate the standard confusing 
convention of using $\kappa^\lambda$ also for $|\kappa^\lambda|$ and 
$\kappa^{<\lambda}$ for $|\kappa^{<\lambda}|$. So when $\lambda$ is infinite, 
$\kappa^{<\lambda} = \sup\{\kappa^\theta : \theta < \lambda \text{ and } \theta \text{ is a cardinal}\}$. 
$\exp(\kappa)$ is used for $2^\kappa$ when the typesetting demands it, as in 
$\exp(\exp(\exp(\omega_2)))$. 

$\mathbb{R}$ is the set of real numbers; $\mathbb{Q}$ is the set of rational 
numbers. $\mathfrak{c} = |\mathbb{R}| = |P(\omega)| = 2^\omega$. 


2. C.u.b. and stationary sets 


The feature that distinguishes the subject of this paper from the 
elementary set theory in high school mathematics is the use of transfinite 
induction on infinite ordinals. 

The purpose of this section is partly to develop your intuition about 
ordinals and partly to introduce some concepts which will be useful later 
on. In many ways, we may think of the ordinals as an extension of the 
natural numbers, but a new phenomenon which appears only at the 
uncountable is that of c.u.b. and stationary sets. 

Some definitions. Fix $\kappa$ a regular uncountable cardinal. If 
$C \subseteq \kappa$, we call C closed iff whenever $\gamma < \kappa$ and 
$C \cap \gamma$ is unbounded in $\gamma$, $\gamma \in C$ (equivalently, C is 
closed in the order topology). C is c.u.b. iff C is closed and unbounded in 
$\kappa$. 

Examples of c.u.b. sets are the set of limit ordinals $< \kappa$ and the set 
of limits of limits. $\{\gamma < \kappa : \gamma \text{ is a cardinal}\}$ is 
always closed in $\kappa$; it is unbounded iff $\kappa$ is a limit cardinal 
(and hence weakly inaccessible, since $\kappa$ is regular). 

There is an analogy here with measure theory. One should think of c.u.b. 
sets as being large, or almost everything, or of probability measure 1. Then 
the intersection of two large sets should be large. In fact, 


2.1. LEMMA. The intersection of $< \kappa$ c.u.b. subsets of $\kappa$ is c.u.b. 


PROOF. Let $C_\xi$ be c.u.b. for $\xi < \alpha$, where $\alpha < \kappa$. It is 
easy to see that $\bigcap_\xi C_\xi$ is closed. To see that it is unbounded, 
define $f_\xi : \kappa \to \kappa$ so that $f_\xi(\gamma)$ is the least ordinal 
in $C_\xi$ larger than $\gamma$. Let 
$g(\gamma) = \sup\{f_\xi(\gamma) : \xi < \alpha\}$. $g(\gamma) < \kappa$ since 
$\kappa$ is regular. Define $h(\gamma, n)$ inductively by 
$h(\gamma, 0) = \gamma$; $h(\gamma, n+1) = g(h(\gamma, n))$. Let 
$\gamma^* = \sup_n h(\gamma, n)$. Then $\gamma < \gamma^* < \kappa$, and for 
each $\xi$, $C_\xi \cap \gamma^*$ is unbounded in $\gamma^*$, so 
$\gamma^* \in \bigcap_\xi C_\xi$. □ 


The reader should note why the above would be nonsense if $\kappa$ were 
allowed to equal $\omega$. 

To continue our analogy with measure theory, stationary sets are those 
which are not of measure 0; that is, we call $S \subseteq \kappa$ stationary 
iff for all c.u.b. $C \subseteq \kappa$, $S \cap C \neq 0$. Then A is 
non-stationary (i.e. of measure 0) iff $A \cap C = 0$ for some c.u.b. set C. 


2.2. LEMMA. (a) The union of $< \kappa$ non-stationary sets is non-stationary. 
(b) If S is stationary and C is c.u.b., then $C \cap S$ is stationary. 
(c) If $\alpha < \kappa$, S is stationary, and $f : S \to \alpha$, then 
$f^{-1}\{\xi\}$ is stationary for some $\xi < \alpha$. 


The lemma is easily checked using 2.1. Intuitively, (a) says that a union of 
a small number of small sets is small, (b) that the intersection of a set of 
measure 1 with a set of positive measure has positive measure, and (c) that 
a set of positive measure cannot be partitioned into a small number of sets 
of measure 0. 

The analogy with measure theory can only be carried so far. For 
example, we shall see in Section 3 that there is a family of $\kappa$ disjoint 
stationary subsets of $\kappa$. Note that it is not obvious at this point that 
one can even get 2. 

The following generalization of 2.2c, known as the pressing-down 
lemma, is fundamental in many combinatorial arguments. 


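2.3. THEOREM (Fodor). Suppose $S \subseteq \kappa$ is stationary, 
$f : S \to \kappa$, and $\forall \eta \in S\,(f(\eta) < \eta)$. Then for some 
$\xi$, $f^{-1}\{\xi\}$ is stationary. 


A special case of 2.3 is that there can be no 1-1 function 
$f : \{\eta : 0 < \eta < \kappa\} \to \kappa$ such that 
$\forall \eta\,(f(\eta) < \eta)$. This seems counterintuitive if you think of 
the function $n - 1$ on the natural numbers; but remember, $\kappa > \omega$. 


PROOF OF THEOREM 2.3. Assume there is no such $\xi$. Then for each $\xi$, 
there is a c.u.b. $C_\xi$ such that 
$\forall \eta \in C_\xi \cap S\,(f(\eta) \neq \xi)$. Let 
$D = \{\eta : \forall \xi < \eta\,(\eta \in C_\xi)\}$. Then $D \cap S = 0$ 
(if $\eta \in D \cap S$ and $\xi = f(\eta)$, then $\xi < \eta$, so 
$\eta \in C_\xi \cap S$ and $f(\eta) \neq \xi$), so we shall obtain a 
contradiction if we show that D is c.u.b. As usual, that D is closed is easy. 
To see that D is unbounded, fix $\gamma < \kappa$. Let $\gamma_0 = \gamma$; let 
$\gamma_{n+1}$ be some ordinal $> \gamma_n$ in 
$\bigcap\{C_\xi : \xi < \gamma_n\}$ (which is c.u.b. by 2.1). Then 
$\sup_n \gamma_n \in D$. □ 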


We conclude with a version of the Löwenheim-Skolem theorem (see
Chapter A.2). It is well-known that every group has a countable subgroup.
We shall show that if ·: ω_1 × ω_1 → ω_1 is a group operation, then {α <
ω_1: (α, ·) is a subgroup of (ω_1, ·)} is c.u.b. More generally, a finitary
function on κ is a function from κ^k → κ for some k ∈ ω. Then:


2.4. THEOREM. If κ > ω is regular and f_n (n ∈ ω) are finitary functions on κ,
then C = {α < κ: ∀n (α is closed under f_n)} is c.u.b.


PROOF. C is clearly closed. To see that it is unbounded, fix γ < κ. Say
f_n: κ^{k_n} → κ. Let γ_0 = γ; let γ_{m+1} be the maximum of γ_m and sup{f_n(s): n ∈
ω and s ∈ (γ_m)^{k_n}}. Then sup_m γ_m ≥ γ and is in C. □


Although we have avoided explicit use of model-theoretic notions in 2.4,
the reader of Chapter A.2 will see that it shows the following: If 𝔄 is a
structure for a countable language and 𝔄 has base set κ, then {α < κ:
𝔄 ↾ α ≺ 𝔄} is c.u.b. in κ.


3. Enumeration principles 


Under this heading we shall group three related principles of
combinatorial set theory.

The first enumeration principle we discuss is the axiom of choice (AC). 
AC is not provable from the other axioms of set theory (see Chapter B.2). 
Stated in the form that the cartesian product of non-empty sets is 
non-empty, AC seems intuitively obvious, but stated in the form that every 
set can be well-ordered, AC begins to look suspicious, since there is no
“natural” way to well-order the real numbers. Nevertheless, in keeping
with traditional mathematical practice, we shall continue to use AC (as we 
did in Section 2) without further comment. 

AC has a number of consequences in analysis that seem somewhat 
pathological. A simple example is: 


3.1. THEOREM. There is a function f: [0, 1] → [0, 1] whose graph has outer
Lebesgue measure 1 in the square.


Of course, the graph of any reasonable function has measure 0, but an f
satisfying 3.1 may easily be constructed by transfinite induction. Let K_α
(α < c) enumerate all closed subsets of [0, 1] × [0, 1] of positive measure.
By induction on β < c, pick p_β, q_β so that:

(1) α < β → p_α ≠ p_β, and

(2) (p_β, q_β) ∈ K_β.

Note that the choice of such p_β, q_β is possible, since by Fubini’s theorem
{p: ∃q ((p, q) ∈ K_β)} has positive measure and hence cardinality c. Take f
so that f(p_β) = q_β for all β. Then the complement of the graph of f contains
no closed subsets of positive measure, and hence has inner measure 0.

A deeper application of AC is the Banach-Tarski paradox (BANACH and 
Tarski [1924]), which says that the Earth may be decomposed into finitely 
many pieces and reassembled to form a sphere with the size of the Sun. To 
see this, apply Theorem 3.3 of Chapter B.2, plus the remarks on = there. 

A use of AC more relevant to this paper is: 


3.2. THEOREM (ULAM [1930]). If S ⊆ ω_1 is stationary, S may be decomposed
into ω_1 disjoint stationary subsets.

PROOF. By AC, pick, for each α < ω_1, a map f_α from ω onto α. For n ∈ ω
and ξ ∈ ω_1, let A^n_ξ = S ∩ {α: f_α(n) = ξ}. For each fixed n, the A^n_ξ (ξ < ω_1)
are disjoint. They may not all be stationary, but it is sufficient to check:


3.3. LEMMA. For some n, ω_1 of the A^n_ξ are stationary.


Given the lemma, {A^n_ξ: A^n_ξ is stationary} satisfies Theorem 3.2. To prove
the lemma, note that for each ξ, ⋃_n A^n_ξ = {α ∈ S: α > ξ} is stationary, so (by
2.2a), there is an n_ξ such that A^{n_ξ}_ξ is stationary. Now take n so that
|{ξ: n_ξ = n}| = ω_1. □


The ω × ω_1 matrix of sets A^n_ξ is known as an Ulam matrix.

Theorem 3.2 remains true if ω_1 is replaced by any other regular κ > ω. If
κ is a successor, the proof is almost verbatim the same. If κ is weakly
inaccessible, then 3.2 for S = κ is trivial, since the sets {α: cf(α) = λ}, for λ
regular and < κ, form the desired partition. However, by Theorem 9 of
SOLOVAY [1971], 3.2 holds for any S.

Our second enumeration principle is the continuum hypothesis, CH. If 
CH is stated by saying that there are no sets of reals of cardinality between 
ω and c, it is almost believable (try to think of one); but if we use the
equivalent version that there is a well-ordering of the real numbers such 
that every element has only countably many predecessors, it begins to look 
even more improbable than AC. So we shall not, in this paper, assume CH, 
and we always state it explicitly as an hypothesis when it is used. CH is 
consistent with and independent from the usual axioms of set theory, as 
shown in Chapters B.4 and B.5. 

There are a large number of applications of CH to the real numbers (see, 
e.g., SIERPINSKI [1934]). Broadly speaking, the results fall into two classes. 
The first are really just combinatorial facts always true about ω_1, which,
under CH, apply to the reals. The second are intrinsically theorems about
the reals, no shadow of which need remain if CH fails. 

An example of the first class is the following: 


3.4. THEOREM (CH). There is a countable family of functions
f_n: [0, 1] → [0, 1] such that [0, 1] × [0, 1] = (⋃_n f_n) ∪ (⋃_n f_n^{-1}).


This should be compared with 3.1; here we identify a function with its
graph. To prove 3.4, it is sufficient to prove the same result with ω_1
replacing [0, 1], in which case CH is not needed. Let, for α < ω_1, g_α map ω
onto α + 1. Let f_n(α) = g_α(n).

An example of the second class is the existence of a Luzin set. A subset
L ⊆ R is called a Luzin set iff L is uncountable but L ∩ K is countable
whenever K is closed and nowhere dense (c.n.w.d.). Such an L has a
number of peculiar properties. With regard to category, it is not very small,
since no uncountable subset of L is of first category. However, with regard
to measure, L is as small as one can get; L has measure 0 under Lebesgue
measure or any other non-atomic Borel measure (since for all ε > 0, there
is a c.n.w.d. K such that R − K has measure < ε).

That Luzin sets exist is not provable without some extra set-theoretic 
axiom. For example, Martin’s axiom (see Chapter B.6) plus not CH implies 
there are no Luzin sets. But: 


3.5. THEOREM (Luzin [1914]). CH implies there is a Luzin set. 


PROOF. Let K_α (α < ω_1) enumerate the c = ω_1 c.n.w.d. sets. Inductively,
pick p_β so that

(1) α < β → p_α ≠ p_β, and

(2) p_β ∉ ⋃{K_α: α < β}.

The construction is possible at stage β by the Baire category theorem,
which states that R is not the union of ω c.n.w.d. sets; in particular,

R ≠ {p_α: α < β} ∪ ⋃{K_α: α < β}.

Now, {p_β: β < ω_1} is a Luzin set. □


One may formulate the basic properties of a Luzin set as intrinsic 
properties of the set itself. Thus, 


3.6. THEOREM (CH). There is an ordering (L, <), such that 
(a) < is a dense total order without endpoints. 
(b) In L, every c.n.w.d. (in the order topology) set is countable. 
(c) L is uncountable. 


PROOF. Following the notation of the proof of 3.5, let L = {p_β: β <
ω_1} ∪ Q. Then (a) and (c) are immediate, and (b) holds because if K is
c.n.w.d. in L, the closure of K in R is c.n.w.d. in R. □


Most statements about analysis which are provable from CH are 
independent of the negation of CH. This is the case for 3.5 and 3.6. One 
that is in fact equivalent to CH is given by the following, which we state
without proof. 


3.7. THEOREM (Erdős). CH is equivalent to the statement: There is an
uncountable family F of entire functions such that for each complex number 
z, {f(z): f € F} is countable. 


Our next enumeration principle is Jensen’s ◇. ◇ asserts: there is a
sequence (A_α: α < ω_1) such that each A_α ⊆ α and


(*) ∀A ⊆ ω_1 [{α: A ∩ α = A_α} is stationary].


◇ implies CH, since (*) yields ∀A ⊆ ω ∃α [A ∩ α = A_α], but CH does not
imply ◇ (Jensen; see DEVLIN and JOHNSBRÅTEN [1974]). However, ◇ does
follow from Gödel’s axiom of constructibility (see Theorem 11.2 of
Chapter B.5). ◇ may also be proved consistent by forcing (see Exercise
4.18 of Chapter B.4).

Like CH, ◇ gives an enumeration of countable sets, but now the
enumeration approximates all subsets of ω_1. Arguments using ◇ resemble
CH arguments, but yield stronger results. This can be illustrated by the
construction of a Suslin line.

Suslin’s hypothesis (SH) is the assertion that any ordering (L, <)
satisfying

(i) < is a dense total order without endpoints,

(ii) L has the countable chain condition (c.c.c.), i.e., in L there is no
uncountable family of disjoint open intervals, and

(iii) (L, <) is Dedekind complete,
is isomorphic to the real line, (R, <). SH is consistent with set theory; for
example, it follows from Martin’s axiom plus not CH (see Chapter B.6),
and in fact is consistent with CH (Jensen; see DEVLIN and JOHNSBRÅTEN
[1974]). However, ◇ implies not SH; thus, assuming ◇, we shall construct
an ordering (L, <) satisfying (i)-(iii) which is not isomorphic to (R, <);
such an ordering is called a Suslin line.

In analogy with 3.6, we shall show: 


3.8. THEOREM (◇). There is an ordering (L, <) such that
(a) < is a dense total order without endpoints. 
(b) In L, every c.n.w.d. set is countable. 
(c*) L is not separable. 


(c*) means that no countable subset of L is dense in L. Before we prove
3.8, we point out why it gives us a Suslin line. Let (L, <) satisfy 3.8. Then L
satisfies condition (ii), for, if {(a_i, b_i): i ∈ I} were an uncountable family of
disjoint open intervals, we could first assume it is maximal (by Zorn’s
lemma), and then set K = L − ⋃_i (a_i, b_i); K would be c.n.w.d., but would
contain all the a_i and b_i, violating (b). L will not be Dedekind complete, so
let L* be its Dedekind completion; it is easy to see that this preserves the
c.c.c. (but not (b)). Then L* satisfies (i)-(iii), but cannot be isomorphic to R,
since this would make L isomorphic to a subspace of R, and hence
separable.

As in the case of a Luzin set, we construct the L of 3.8 by induction, 
eventually avoiding all c.n.w.d. sets. But since L cannot be a subspace of R, 
there is no natural ordering given in advance from which to pick points. 
Rather, we must inductively define the order on the points of L as we go 
along. What the points of L actually are is now irrelevant; to simplify 
notation, we might as well take L to be ω_1 as a set; thus, we shall construct
an order, ◁, on ω_1, to satisfy 3.8.

We define ◁ in steps of ω. In the following discussion α and β will range
over countable limit ordinals. By induction on α we define an order ◁_α on
α so that:

(i) Each ◁_α is a dense total order without endpoints, and

(ii) if α < β, ◁_α = ◁_β ∩ (α × α).

(ii) just says that the orderings extend each other. (i) implies that each
(α, ◁_α) is isomorphic to the rationals, Q, but the important thing is the way
the orderings are nested. We may take ◁_ω to be any ordering of ω
isomorphic to Q. If β is a limit of limits, (ii) forces us to take ◁_β =
⋃{◁_α: α < β}. Likewise, we let ◁ = ⋃{◁_α: α < ω_1}; then (ω_1, ◁) satisfies (a) of 3.8.
The only thing left to specify in the construction is how to define ◁_{β+ω}
from ◁_β; we must do this in such a way as to make (b) and (c*) hold.

(c*) is easy. It is sufficient to guarantee that no β is dense in ω_1 (in the
order ◁). Given ◁_β, we let D_β be a proper Dedekind cut in (β, ◁_β) (i.e., D_β
is a proper initial segment of β with no supremum). Form ◁_{β+ω} by
inserting a copy of Q into D_β; i.e., the ordinals β + n (n ∈ ω) are ordered
isomorphically to Q, and placed after the elements of D_β and before the
elements of β − D_β. Then β will not be dense in β + ω, so certainly not in
ω_1 either.

We now show how, by judicious choice of the D_β, we may make (ω_1, ◁)
satisfy (b). The idea is to apply the Baire theorem to avoid all previously
listed c.n.w.d. sets, but we must first see what the Baire theorem says for
countable orderings, like Q. If K is c.n.w.d. in Q and D is a proper
Dedekind cut in Q, we say D avoids K iff there are a < b in Q with
[a, b] ∩ K = 0 and a < D < b (i.e. a ∈ D, b ∉ D). Then the Baire theorem
says:


3.9. LEMMA. If K_n (n ∈ ω) are c.n.w.d. in Q, there is a proper Dedekind cut
D which avoids each K_n.


PROOF. The closures of each K_n in R are c.n.w.d., so there is an irrational x
which is not in the closure of any K_n. Let D be the cut defined by x. □


Now, let (A_α: α < ω_1) be given by ◇ to satisfy (*). Applying the lemma,
take each D_β so that it avoids all the A_α for α ≤ β such that A_α is c.n.w.d.
in (β, ◁_β). This completes the description of the construction; we need only
check that it works, i.e., that (b) holds.

Let K ⊆ ω_1 be c.n.w.d. in (ω_1, ◁). K ∩ α need not be c.n.w.d. in (α, ◁_α)
for all α, but it is for lots of them. To see this, let f, g: ω_1 → ω_1 and
h, h': ω_1² → ω_1 be such that for each ξ ∈ ω_1 − K, f(ξ) ◁ ξ ◁ g(ξ) and
(f(ξ), g(ξ)) ∩ K = 0, and for each ξ ◁ η, h(ξ, η), h'(ξ, η) ∈ (ξ, η),
h(ξ, η) ∉ K, and h'(ξ, η) ∈ K whenever K ∩ (ξ, η) ≠ 0. Applying Theorem
2.4, C = {α < ω_1: α is a limit and α is closed under f, g, h, h'} is c.u.b. If
α ∈ C, then closure under f, g implies that K ∩ α is closed in (α, ◁_α), and
closure under h implies that it is n.w.d.

By (*), fix α ∈ C such that K ∩ α = A_α. We shall be done if we can show
that K ⊆ α. This follows by (i) of:


3.10. LEMMA. For all limits β ≥ α,

(i) K ∩ β = K ∩ α, and

(ii) if γ ∈ β − K, there are ξ, η ∈ α with ξ ◁ γ ◁ η and (ξ, η) ∩ K = 0 in
(ω_1, ◁).


PROOF. Induction on β. For β = α, use closure under f and g. The
induction is trivial at limits, so assume 3.10 holds for β; we check it for
β + ω. Now by 3.10 for β, A_α = K ∩ β is c.n.w.d. in (β, ◁_β), so D_β avoids
A_α. Thus, there are γ ◁ γ' in β such that [γ, γ'] ∩ K ∩ β = 0 and all the
ordinals β + n (n ∈ ω) lie in (γ, γ'). Let ξ, η, ξ', η' < α arise from applying
3.10(ii) to γ, γ' respectively. Then (ξ, η') ∩ K ∩ α = 0, whence, by closure
of α under h', (ξ, η') ∩ K = 0. Since all the β + n lie in (ξ, η'), (i) and (ii)
for β + ω follow immediately. □


This completes our discussion of AC, CH, and ◇. For a more thorough
discussion of combinatorial properties generalizing ◇, see Chapter B.5 or
DEVLIN and JOHNSBRÅTEN [1974].


4. Trees


Induction along a tree is a generalization of induction on an ordinal and 
is often useful in combinatorial arguments. The theorems proved in this 
section with trees will be fairly elementary. Deeper results will be obtained 
in Section 6. 

Some definitions. A tree is a partial ordering, (T, <), such that for each
y ∈ T, {x: x < y} is well-ordered by <. The α-th level of T is {y: {x: x <
y} has type α}. The height of T is the first α such that the α-th level of T is
empty; then all higher levels are also empty and all lower levels are
non-empty.

T' is a subtree of T iff T' is obtained by pruning off some branches of T;
i.e. T' ⊆ T and ∀y ∈ T' ∀x ∈ T [x < y → x ∈ T'].

For each α and set A, the complete A-ary tree of height α, A^{<α}, is
⋃{A^ξ: ξ < α} (i.e., the tree of < α-sequences from A); in A^{<α}, s ≤ t iff
s ⊆ t. When talking about such trees, we use s⌢a (s ∈ A^ξ, a ∈ A) to
denote the t ∈ A^{ξ+1} such that t ↾ ξ = s and t(ξ) = a; ⟨ ⟩ denotes the empty
sequence and lh s = dom(s) (the length of s); formally, ⟨ ⟩ = 0 and
s⌢a = s ∪ {⟨lh s, a⟩}.

The tree T pictured in Fig. 1 is a subtree of the complete binary (2-ary) 
tree of height 3; T has height 3, i.e., 3 non-0 levels; its 0-th, 1-th, and 2-th
levels are {⟨ ⟩}, {⟨0⟩, ⟨1⟩}, {⟨1,0⟩, ⟨1,1⟩} respectively. Note that our
mathematical tree is what is left of a real tree when one removes the 
leaves, roots, trunk, and branches.


Fig. 1. The tree T, with nodes ⟨ ⟩; ⟨0⟩, ⟨1⟩; ⟨1,0⟩, ⟨1,1⟩.
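
The definitions of level, height and subtree can be checked mechanically on the small tree of Fig. 1. The following is only a sketch for finite trees of finite sequences: the names below, level, height, is_subtree and complete_tree are illustrative and not from the text, and the type of {x: x < y} is simply computed by counting, which is legitimate only because everything here is finite.

```python
from itertools import product

# Nodes are tuples over {0, 1}; the tree order is "proper initial segment".
T = {(), (0,), (1,), (1, 0), (1, 1)}  # the tree of Fig. 1


def below(x, y):
    """x < y in the tree: x is a proper initial segment of y."""
    return len(x) < len(y) and y[:len(x)] == x


def level(tree, n):
    """The n-th level: nodes whose set of predecessors has n elements."""
    return {y for y in tree if sum(below(x, y) for x in tree) == n}


def height(tree):
    """The first n such that the n-th level is empty."""
    n = 0
    while level(tree, n):
        n += 1
    return n


def is_subtree(sub, tree):
    """sub is contained in tree and closed downward inside tree."""
    return sub <= tree and all(
        x in sub for y in sub for x in tree if below(x, y)
    )


def complete_tree(alphabet, h):
    """All sequences from `alphabet` of length < h (finite h only)."""
    return {s for n in range(h) for s in product(alphabet, repeat=n)}
```

With these definitions, level(T, 2) returns the two nodes (1, 0) and (1, 1), and T passes is_subtree against complete_tree((0, 1), 3), in line with the description of Fig. 1.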


The ordinal α is a trivial example of a tree of height α; its ξ-th level
is {ξ}.

An example of a simple result proved using induction on 2^{<ω} is


4.1. THEOREM. If X is a closed subset of [0,1] and has no isolated points, 
then |X|=c. 


An X satisfying the hypothesis of 4.1 is called perfect. For the proof, first 
note: 


4.2. LEMMA. If X is perfect, there are disjoint X_0, X_1 ⊆ X such that X_0 and
X_1 are perfect.


PROOF OF THEOREM 4.1. Define perfect X_s (s ∈ 2^{<ω}) as follows: X_⟨⟩ = X;
given X_s, let X_{s⌢0}, X_{s⌢1} be disjoint perfect subsets of X_s. If f ∈ 2^ω, let
X_f = ⋂_n X_{f↾n}. Then as f ranges over 2^ω, the X_f are non-empty (by
compactness) and disjoint, so |X| ≥ c. Since X ⊆ [0, 1], |X| ≤ c. □


An obvious generalization of 4.1, with the same proof, is that if X is any
compact Hausdorff space with no isolated points, then |X| ≥ c. An
analogous result using induction on 2^{<ω_1} is:


4.3. THEOREM (Čech-Pospíšil). If X is compact Hausdorff and no point of X
is a G_δ, then |X| ≥ 2^{ω_1}.


Let us call such an X pluperfect. Pluperfect spaces do not arise frequently 
in applied mathematics. Standard examples are βN − N, the unit ball of the
dual spaces of L*(R) in the weak* topology, and any product of 
uncountably many compact Hausdorff spaces with more than two points. 
Before proving 4.3, we prove three easy lemmas about pluperfect X. 


4.4. LEMMA. If Y ⊆ X is a closed G_δ in X, then Y is pluperfect.
PROOF. If a point of Y were a G_δ in Y, it would be a G_δ in X. □


4.5. LEMMA. If U ⊆ X is non-empty and open, there is a non-empty closed
G_δ Y ⊆ U.


PROOF. Fix p ∈ U. Define inductively open V_n containing p such that
U ⊇ cl(V_0) ⊇ V_0 ⊇ cl(V_1) ⊇ V_1 ⊇ cl(V_2) ⊇ ⋯, where cl denotes closure.
Let Y = ⋂_n V_n = ⋂_n cl(V_n). □


4.6. LEMMA. There are disjoint non-empty X_0, X_1 ⊆ X, each of which is
closed G_δ.

PROOF. Take two disjoint open sets and apply 4.5. □


PROOF OF THEOREM 4.3. Define closed non-0 G_δ’s X_s (s ∈ 2^{<ω_1}) as follows:
X_⟨⟩ = X; given X_s, let X_{s⌢0}, X_{s⌢1} be disjoint closed G_δ’s which are subsets
of X_s; if lh s is a limit, X_s = ⋂{X_{s↾α}: α < lh s}. If f ∈ 2^{ω_1}, let X_f =
⋂{X_{f↾α}: α < ω_1}. Then the X_f (f ∈ 2^{ω_1}) are disjoint and non-empty, so |X| ≥
2^{ω_1}. □


The proof of 4.3 easily generalizes to show that if X is compact
Hausdorff and every point in X has character ≥ κ, then |X| ≥ 2^κ. For more
on applications of trees to topology, see JUHÁSZ [1971] and Chapter B.7.

We now discuss some questions regarding paths through trees. We call P
a path through T iff P is totally ordered by < and contains exactly one
element from each non-0 level of T. If T has height α and is a subtree of
A^{<α}, we identify a path P with the f: α → A such that P = {f ↾ ξ: ξ < α}.
Trees of the form A^{<α} and all trees of successor height trivially have
paths, but some trees have none. For example, fix γ ≥ ω and let T =
{s ∈ γ^{<ω}: s(0) > s(1) > s(2) > ⋯}. Then T has height ω but no paths,
since there are no decreasing ω-sequences of ordinals. For trees of height ω
we do have


4.7. LEMMA (König). If T has height ω and every level of T is finite, then
there is a path through T. 


PROOF. Pick x_0 in level 0 of T such that {y ∈ T: y > x_0} is infinite; this is
possible since T is infinite and level 0 is finite. Now inductively pick x_n
(n ∈ ω) in level n of T so that for each n, x_{n+1} > x_n and {y ∈ T: y > x_{n+1}} is
infinite. Then {x_n: n ∈ ω} is a path through T. □
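
The selection step in this proof can be imitated on a finite truncation of a tree. The sketch below is not the lemma itself: where the proof asks for a node with infinitely many nodes above it, the code asks for a node that still has a descendant at a fixed level `depth`. The function konig_path and the tree `example` are hypothetical illustrations, and the tree is assumed to be a prefix-closed set of tuples.

```python
from itertools import product


def konig_path(tree, depth):
    """Pick x_0 < x_1 < ... < x_depth, one node per level, as in the proof.

    `tree` is assumed to be a prefix-closed set of tuples. "Has infinitely
    many nodes above it" is replaced by "has a descendant at level `depth`".
    Returns the list of chosen nodes, or None if level `depth` is empty.
    """
    def reaches_bottom(node):
        return any(
            len(y) == depth and y[:len(node)] == node for y in tree
        )

    node = ()
    if not reaches_bottom(node):
        return None
    path = [node]
    while len(node) < depth:
        children = [
            y for y in tree
            if len(y) == len(node) + 1 and y[:len(node)] == node
        ]
        # Some child must reach the bottom, because `node` does and the
        # tree is prefix-closed.
        node = next(c for c in children if reaches_bottom(c))
        path.append(node)
    return path


DEPTH = 5
# Hypothetical example: every branch through (0,) dies out after level 2.
example = {
    s
    for n in range(DEPTH + 1)
    for s in product((0, 1), repeat=n)
    if not (s[:1] == (0,) and n > 2)
}
```

On `example`, the path is forced to start with (1,), since nothing above (0,) survives to level 5; this is the finite shadow of choosing x_0 with infinitely many nodes above it.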


The obvious generalization of König’s lemma to ω_1 is false:


4.8. THEOREM. There is a tree T of height ω_1 such that every level of T is
countable but there are no paths through T. 


PROOF. Let S = {s ∈ ω^{<ω_1}: s is 1-1}. S is a subtree of ω^{<ω_1}. There are
no paths through S since there are no 1-1 functions from ω_1 to ω. Of
course, the levels of S are uncountable, so we shall use a suitable subtree of
S. If s, t ∈ ω^α, say s ~ t iff {ξ: s(ξ) ≠ t(ξ)} is finite.


4.9. LEMMA. There is a sequence s_α (α < ω_1) from S such that whenever
α < β, s_α ~ s_β ↾ α.

Assuming 4.9, T = ⋃{{s ∈ ω^α: s ∈ S and s ~ s_α}: α < ω_1} satisfies 4.8. To
prove 4.9, define s_α by induction, making sure that ω − (range(s_α)) is
infinite. Then s_{α+1} can be any 1-1 extension of s_α. If γ is a limit and s_α
(α < γ) are given, let γ_n ↗ γ (n ∈ ω). By induction on n, find s'_n ∈ S such
that s'_n ~ s_{γ_n} and m < n → s'_m = s'_n ↾ γ_m. Let t = ⋃_n s'_n. Let s_γ ∈ S be such
that s_γ(ξ) = t(ξ) for ξ ∉ {γ_n: n ∈ ω} and ω − range(s_γ) is infinite. □


For any regular κ, we call a κ-Aronszajn tree a tree T of height κ such
that all levels of T have cardinality less than κ and T has no paths. Then
König’s lemma says that there are no ω-Aronszajn trees, whereas 4.8
shows that there is an ω_1-Aronszajn tree. The argument of 4.8 easily
generalizes to construct a κ-Aronszajn tree whenever κ = λ^+, λ is regular,
and for all cardinals θ < λ, 2^θ ≤ λ (in proving 4.9, make sure range(s_α) is
non-stationary). In particular, under GCH, there is a κ-Aronszajn tree
whenever κ is the successor of a regular cardinal. The situation for
successors of singular cardinals is open under GCH, although if V = L,
there are κ-Aronszajn trees for such κ (Jensen; see Chapter B.5). It is
consistent with c = ω_2 that there are no ω_2-Aronszajn trees (see MITCHELL
[1972]). κ-Aronszajn trees for κ strongly inaccessible are discussed in
Section 6.

Trees are very much related to total orders and the Suslin problem (see
Section 3). In general, given a tree (T, <), we may obtain a (strict) total
order ◁ of the points of T as follows. First, picture T written on the
blackboard, rather than growing out of the ground. Then points on a given
level are ordered from left to right. To compare points at different levels,
we squash T down into the chalk bin, making sure that branches do not get
squashed down above a point (Fig. 2). More formally, we call a total order
◁ of T a squashing of < iff

(a) If b and c are at the same level of T, b ≤ d, c ≤ g, and b ◁ c, then
d ◁ g, and

(b) If a < c < e, then a ◁ c iff a ◁ e.

Such a ◁ is easy to construct by induction on the levels of T, but it is not
unique; we are perfectly free to order the immediate successors of a point x
any way we like, and the placement of x with respect to them is also
completely arbitrary. We remark that if T is a subtree of some 2^{<α}, then
the lexicographic order is a squashing of it.


4.10. THEOREM. Let κ be regular, (T, <) a κ-Aronszajn tree, and ◁ a
squashing of T. Then there are no strictly increasing or strictly decreasing
κ-sequences in (T, ◁).


Fig. 2.


PROOF. Say (x_α: α < κ) were an increasing sequence. For each ξ < κ,
define a point y_ξ on the ξ-th level as follows: Fix ξ, then fix α such that for
all β > α, x_β is at a level > ξ; for β > α, let z_β be the element at level ξ
below x_β; then (z_β: α < β < κ) is non-decreasing in ◁, and hence eventually
constant; let y_ξ be this constant value. Now {y_ξ: ξ < κ} is a path
through T, a contradiction. □


Suslin trees are very nice Aronszajn trees. A squashed Suslin tree is
(with a bit of fudging) a Suslin line. If one starts with a Suslin line, the
method of proof of the Čech-Pospíšil theorem yields a Suslin tree.

More formally, an antichain in T is a set A ⊆ T such that whenever x, y
are distinct elements of A, x ≮ y and y ≮ x. A κ-Suslin tree is a
κ-Aronszajn tree with the property that T has no antichains of cardinality
κ. Then


4.11. THEOREM. There exists an ω_1-Suslin tree iff there exists a Suslin line.


PROOF. First, let (T, <) be an ω_1-Suslin tree. It is sufficient to produce a
c.c.c. dense total order which is not separable, since its Dedekind
completion (with the first and last elements removed) will then be a Suslin line.

Now, if ◁ is any squashing of (T, <), it almost works, except that it may
be neither c.c.c. nor dense. To patch up density, call points x and y
equivalent (x ~ y) iff there are at most countably many elements between
them in ◁. Let L = T/~ with the natural order, which we call also ◁. Let
[x] be the equivalence class of x.




L is clearly dense in itself. To see that L is c.c.c., suppose 𝒥 were an
uncountable family of disjoint open intervals. If ([x], [y]) ∈ 𝒥, then there
are uncountably many elements between x and y in ◁, so we may
inductively pick x_α, y_α, z_α (α < ω_1) so that ([x_α], [y_α]) ∈ 𝒥, x_α ◁ z_α ◁ y_α, the
level of z_α in T is above those of x_α and y_α, and the levels of x_α and y_α in T
are above those of z_β for β < α. Then {z_α: α < ω_1} would be an uncountable
antichain in T.

To show that L is not separable, let X ⊆ L be countable. Let α be above
the levels of all elements of equivalence classes in X. If x is in the α-th
level, [x] and all equivalence classes of points above x in T define the same 
Dedekind cut in X, so at most two of these can be in the closure of X. Thus, 
the closure of X is countable. 

The construction of the tree from the line is analogous to the proofs of
4.1 and 4.3. Let (L, <) be a Suslin line. Let L' = L ∪ {−∞, +∞}, ordered
in the obvious way; then L' is compact. Now, by induction on s ∈ 2^{<ω_1}
define closed (possibly degenerate) intervals, [a_s, b_s] ⊆ L' (a_s ≤ b_s).
[a_⟨⟩, b_⟨⟩] = L' = [−∞, +∞]. If a_s = b_s let [a_{s⌢0}, b_{s⌢0}] = [a_{s⌢1}, b_{s⌢1}] = [a_s, b_s].
If a_s < b_s let a_{s⌢1} = b_{s⌢0} be some point in (a_s, b_s), a_{s⌢0} = a_s and b_{s⌢1} = b_s;
i.e., we split [a_s, b_s] into two adjacent intervals. If lh s is a limit, let
[a_s, b_s] = ⋂{[a_{s↾α}, b_{s↾α}]: α < lh s}.

Let T = {s ∈ 2^{<ω_1}: a_s < b_s}. Then T is a subtree of 2^{<ω_1}, and we shall check
that it is Suslin. First, for each p ∈ L, we can let f_p: ω_1 → 2 be chosen so
that ∀α < ω_1 (p ∈ [a_{f_p↾α}, b_{f_p↾α}]). Since there can be no ω_1-sequence of
strictly nested intervals, there must be an α such that a_{f_p↾α} = b_{f_p↾α} = p; if α
is least, let s_p = f_p ↾ α. Then lh(s_p) is a limit, and for ξ < lh(s_p), s_p ↾ ξ ∈ T.
Since


{p} = ⋂{[a_{s_p↾ξ}, b_{s_p↾ξ}]: ξ < lh s_p},


{a_t: t ∈ T} ∪ {b_t: t ∈ T} is dense in L', and hence uncountable; so T is
uncountable. However, each antichain (and hence each level) of T is
countable, since if A is an antichain, the (a_s, b_s) for s ∈ A are disjoint.
Finally, T is Aronszajn, since a path through T would yield an ω_1-sequence
of strictly nested intervals in L'. □


By Theorems 3.8 and 4.11, ◇ implies that there is an ω_1-Suslin tree. One
may also construct the tree directly using ◇. It is this direct construction
which is more useful in generalizations to larger cardinals (see Section 11 of 
Chapter B.5). 

Although, in elementary topology, it is the line which has the most 
obvious interest, the tree may be used directly to construct various
topological spaces (see, e.g. RUDIN [1975]).
For more on trees in general, see JECH [1971].


5. Almost disjoint sets 


Obviously, ω cannot be partitioned into more than ω pairwise disjoint
sets. However, if we call x, y ⊆ ω almost disjoint iff |x| = |y| = ω and
|x ∩ y| < ω, then


5.1. THEOREM. There is a family 𝒜 ⊆ P(ω) such that |𝒜| = 2^ω and the
elements of 𝒜 are pairwise almost disjoint.


PROOF. The idea is that any two paths through the complete binary tree of
height ω are almost disjoint. Thus, we shall take 𝒜 ⊆ P(2^{<ω}). If f ∈ 2^ω,
let a_f = {f ↾ n: n ∈ ω}. Let 𝒜 = {a_f: f ∈ 2^ω}. □
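
The idea of this proof can be tried out on finite pieces of branches. In the sketch below, the coding of finite 0/1 sequences by natural numbers (the function code) is an assumption added for illustration, used to turn subsets of the binary tree into subsets of ω; branch_set only lists the first few elements of the set attached to a branch f, since the full set is infinite.

```python
def code(s):
    """Code a finite 0/1 sequence injectively as a natural number.

    Writes a leading 1 followed by the bits of s and reads the result in
    binary, so distinct sequences receive distinct codes.
    """
    n = 1
    for bit in s:
        n = 2 * n + bit
    return n


def branch_set(f, length):
    """Codes of the restrictions of f to n, for n < length.

    `f` is a function from natural numbers to {0, 1}, standing for a
    branch of the complete binary tree of height omega.
    """
    return {code(tuple(f(i) for i in range(n))) for n in range(length)}
```

For example, if f is constantly 0 and g first takes the value 1 at argument 3, then f and g share exactly the restrictions to 0, 1, 2 and 3, so branch_set(f, N) and branch_set(g, N) have 4 common elements for every N ≥ 4, however large N is taken.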


Generalizations to larger cardinals depend on the axioms of set theory. If
x, y ⊆ κ, call x, y almost disjoint iff |x| = |y| = κ but |x ∩ y| < κ. To repeat
the above argument, we would need that the tree 2^{<κ} has cardinality only
κ, which may be false. In fact, the existence of a family of 2^{ω_1} almost
disjoint subsets of ω_1 is consistent with and independent from ZFC + 2^ω =
2^{ω_1} = ω_3.

The method in 5.1 may actually be used to prove statements about the 
behavior of the cardinal exponentiation function at singular cardinals. It is 
consistent with ZFC that exp on regular cardinals be anything not patently 
absurd (e.g., exp(w) = @,,, exp (@1) = exp (w2) = .,,+35, etc. is consistent — 
see EASTON [1970] or Theorem 5.9 of Chapter B.4), but at singular cardinals
the situation is more obscure. It is open whether it is consistent that GCH
first fails at ω_ω (i.e. ∀n < ω [exp(ω_n) = ω_{n+1}] but exp(ω_ω) = ω_{ω+2}).
However, this cannot happen at ω_{ω_1}, i.e.:


5.2. THEOREM (Silver). If ∀α < ω_1 (exp(ω_α) = ω_{α+1}), then exp(ω_{ω_1}) = ω_{ω_1+1}.


Silver’s proof used metamathematical ideas outside the scope of this 
paper. We present a direct combinatorial proof due independently to 
Baumgartner, Jensen and Prikry. 

The appropriate generalization of 5.1 uses almost disjoint functions: 


5.3. LEMMA. Assume ∀α < ω_1 (exp(ω_α) = ω_{α+1}). Then there is an ℱ ⊆
(ω_{ω_1})^{ω_1} such that

(a) ∀f, g ∈ ℱ (f ≠ g → ∃α < ω_1 ∀β > α (f(β) ≠ g(β))).

(b) ∀f ∈ ℱ ∀α < ω_1 (f(α) < ω_{α+1}).

(c) |ℱ| = exp(ω_{ω_1}).


PROOF. For α < ω_1, fix a 1-1 map φ_α from P(ω_α) into ω_{α+1}. If X ⊆ ω_{ω_1}, let
f_X(α) = φ_α(X ∩ ω_α). Let ℱ = {f_X: X ⊆ ω_{ω_1}}. □


We now attempt to bound |ℱ|. If (b) were replaced by f(α) < ω_α, a
bound would result from the pressing-down lemma (2.3). More generally:


5.4. LEMMA. Assume ∀α < ω_1 (exp(ω_α) = ω_{α+1}). Let ℱ satisfy 5.3 (a) plus
(b') ∀f ∈ ℱ ({α: f(α) < ω_α} is stationary).
Then |ℱ| ≤ ω_{ω_1}.


PROOF. For f ∈ ℱ, define f*: ω_1 → ω_1 so that |f(α)| = ω_{f*(α)}. Then
{α: f*(α) < α} is stationary, so by 2.3 there is a β_f < ω_1 and a stationary
S_f ⊆ ω_1 with ∀α ∈ S_f (f*(α) = β_f). Now there are only ω_1 possibilities for β_f
and ω_2 possibilities for S_f. Furthermore, for each fixed β and S,


|{f ∈ ℱ: β_f = β & S_f = S}| ≤ ω_{β+2},


since the f ↾ S are distinct (by (a)) functions into ω_{β+1}. Thus, |ℱ| ≤ ω_{ω_1}. □

Lemma 5.4 may be generalized to:


5.5. LEMMA. Assume ∀α < ω_1 (exp(ω_α) = ω_{α+1}). Let g: ω_1 → ω_{ω_1} be such
that ∀α (g(α) < ω_{α+1}). Let ℱ satisfy 5.3 (a) plus

(b'') ∀f ∈ ℱ ({α: f(α) < g(α)} is stationary).
Then |ℱ| ≤ ω_{ω_1}.


PROOF. For α < ω_1, fix a 1-1 map ψ_α from g(α) into ω_α. Let f'(α) be
ψ_α(f(α)) when f(α) < g(α) and f(α) when f(α) ≥ g(α), and apply 5.4 to
{f': f ∈ ℱ}. □


Now, let ℱ be as in 5.3. We complete the proof of 5.2 by showing
|ℱ| ≤ ω_{ω_1+1}. For g and f distinct members of ℱ, we say g < f a.e. iff
{α: f(α) < g(α)} is not stationary, i.e., {α: g(α) < f(α)} contains a c.u.b.
subset. Then 5.5 says that for any g ∈ ℱ, g < f a.e. for all but at most ω_{ω_1}
members f of ℱ. We may thus inductively pick f_μ ∈ ℱ for μ < ω_{ω_1+1} so that
μ < ν → f_μ < f_ν a.e. Let ℱ_μ = {g ∈ ℱ: g < f_μ a.e.}. By 5.5, ⋃{ℱ_μ: μ <
ω_{ω_1+1}} = ℱ, but again by 5.5, each |ℱ_μ| ≤ ω_{ω_1}. Thus |ℱ| ≤ ω_{ω_1+1}. □


Theorem 5.2 easily generalizes to: 


5.6. THEOREM. If ω < cf(λ) < λ and {κ < λ: exp(κ) = κ^+} is stationary in
λ, then exp(λ) = λ^+.


For further generalizations, see GALVIN and HAJNAL [1975].

A concept related to almost disjoint families is that of quasi-disjointness.
A family 𝒜 of sets is quasi-disjoint, or forms a Δ-system (see Fig. 3) iff
there is a fixed b such that a ∩ a' = b whenever a and a' are distinct
members of 𝒜. b is called the root of 𝒜. The following “Δ-system lemma”
is frequently used in topology and combinatorics.


Fig. 3.


5.7. THEOREM. If 𝒜 is an uncountable family of finite sets, there is an
uncountable ℬ ⊆ 𝒜 which is quasi-disjoint.


PROOF. Since there are only ω finite cardinals, we may assume that all the
members of 𝒜 have cardinality n for some fixed n. We proceed by
induction on n. If n = 1, 5.7 is trivial. If n = m + 1 where m ≥ 1, we
consider two cases.

Case I. Some element p is contained in uncountably many members of
𝒜. Apply the inductive hypothesis to {a − {p}: a ∈ 𝒜 and p ∈ a}.

Case II. Not case I. Inductively pick a_α ∈ 𝒜 (α < ω_1) so that a_α ∩ a_ξ =
0 for all ξ < α. Let ℬ = {a_α: α < ω_1}. □
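
The two cases of this proof have a finite counterpart, in which "uncountable" is replaced by "at least k members". The sketch below is an illustration under that substitution and not the theorem itself: the function delta_system is a hypothetical helper that searches a finite family of finite sets for k members forming a Δ-system, trying pairwise disjoint members first (as in Case II) and otherwise recursing on the members containing a common point p (as in Case I). It may return None when the family is too small.

```python
from itertools import combinations


def delta_system(family, k):
    """Search `family` (an iterable of frozensets) for a Delta-system of size k.

    Returns a pair (root, members) with `members` a list of k sets from the
    family whose pairwise intersections all equal `root`, or None if this
    search finds nothing.
    """
    family = list(dict.fromkeys(family))  # drop duplicates, keep order
    if len(family) < k:
        return None

    # Analogue of Case II: greedily collect pairwise disjoint members.
    disjoint, used = [], set()
    for a in family:
        if not (a & used):
            disjoint.append(a)
            used |= a
    if len(disjoint) >= k:
        return frozenset(), disjoint[:k]

    # Analogue of Case I: every member meets `used`, so try each point p
    # of `used`, most frequent first, and recurse on {a - {p} : p in a}.
    by_frequency = sorted(used, key=lambda q: -sum(q in a for a in family))
    for p in by_frequency:
        smaller = [a - {p} for a in family if p in a]
        found = delta_system(smaller, k)
        if found is not None:
            root, members = found
            return root | {p}, [m | {p} for m in members]
    return None
```

For instance, applied to all 3-element subsets of {0, ..., 6} with k = 4, the search cannot find 4 pairwise disjoint sets, so it fixes points one at a time and ends with 4 sets sharing a 2-element root.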


For generalizations of 5.7 to other cardinalities, see JUHÁSZ [1971].

We present an elementary topological application of 5.7. A space X is
called c.c.c. iff there are no uncountable families of pairwise disjoint open
sets in X. So R^n is c.c.c. for all n. A Suslin line (if it exists; see Section 3) is
c.c.c. but its square is not. It is consistent with set theory that any product
of c.c.c. spaces is c.c.c. (see Chapter B.6). The following shows that,
regardless of the axioms of set theory, it is sufficient to look at finite
products; it implies immediately that R^κ is c.c.c. for all κ.


5.8. THEOREM. If $X_i$ ($i \in I$) are spaces and $\prod_{i \in I} X_i$ is not c.c.c., then there is
a finite $b \subseteq I$ with $\prod_{i \in b} X_i$ not c.c.c.


PROOF. Say $V_\alpha$ ($\alpha < \omega_1$) are pairwise disjoint basic open sets in the
product. Each $V_\alpha$ depends only on a finite set of coordinates, $a_\alpha \subseteq I$. By 5.7,
let $T$ be an uncountable subset of $\omega_1$ such that $\{a_\alpha : \alpha \in T\}$ form a $\Delta$-system
with root $b$. Let $V^*_\alpha$ be the projection of $V_\alpha$ in $\prod_{i \in b} X_i$. Then the $V^*_\alpha$ ($\alpha \in T$)
are disjoint, so $\prod_{i \in b} X_i$ is not c.c.c. □


6. Partition calculus 


Intuitively, a graph is a set of points in space, some of which are
connected by lines (see Fig. 4). Formally, let $[I]^2 = \{\{x, y\} : x, y \in I \;\&\; x \ne y\}$.
Then a graph is a pair $(I, \mathcal{E})$ where $\mathcal{E} \subseteq [I]^2$. Thus, in Fig. 4A, $I = 4$ and
$\mathcal{E} = \{\{0, 1\}, \{1, 3\}, \{1, 2\}\}$. $(I', \mathcal{E}')$ is a subgraph of $(I, \mathcal{E})$ iff $I' \subseteq I$ and
$\mathcal{E}' = \mathcal{E} \cap [I']^2$; so graph A is a subgraph of graph C. The complete graph on
$I$ is $(I, [I]^2)$. The empty graph on $I$ is $(I, 0)$.

A special case of the finite version of Ramsey’s theorem is that if a graph 
has 6 or more points, then it either has a 3-point empty subgraph (e.g. 
{1, 4, 5} in C) or a 3-point complete subgraph (e.g. {0, 1,2} in D). Graph B 
shows that this can fail for graphs with only 5 points. The reader is invited 
to construct a proof that 6 is large enough. 
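For readers who prefer an experiment to a proof, the claim about 5 and 6 points can be confirmed by exhaustive search. The sketch below is not a substitute for the proof the reader is invited to construct. The 5-point counterexample used here is the pentagon (5-cycle), which is an assumption on our part since Fig. 4B is not reproduced; the function names are hypothetical.

```python
from itertools import combinations, product

def has_mono_triple(n, edges):
    """True if the graph on points 0..n-1 with the given edge set has a
    3-point complete subgraph or a 3-point empty subgraph."""
    for triple in combinations(range(n), 3):
        present = [frozenset(p) in edges for p in combinations(triple, 2)]
        if all(present) or not any(present):
            return True
    return False

def every_graph_has_mono_triple(n):
    """Check all 2^(n choose 2) graphs on n points by brute force."""
    pairs = [frozenset(p) for p in combinations(range(n), 2)]
    for bits in product([False, True], repeat=len(pairs)):
        edges = {p for p, b in zip(pairs, bits) if b}
        if not has_mono_triple(n, edges):
            return False
    return True

# The 5-cycle: neither it nor its complement contains a triangle.
pentagon = {frozenset({i, (i + 1) % 5}) for i in range(5)}
```

Calling `every_graph_has_mono_triple(6)` inspects all 32768 graphs on 6 points, which takes well under a second on ordinary hardware.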

More generally, Ramsey showed that for each $j \in \omega$, there is an $i \in \omega$
such that whenever $(I, \mathcal{E})$ has at least $i$ points, it has either a complete or an
empty subgraph with $j$ points. Call $R(j)$ the least $i$ that works; so $R(2) = 2$
and $R(3) = 6$. Unfortunately, there is no nice formula for $R(j)$. Fortunately,
however, infinite combinatorics is much simpler than finite combinatorics;
$R(\omega)$ is exactly $\omega$ and $R$ of most other infinite cardinals can be
computed explicitly in terms of cardinal arithmetic.

The above notions may be generalized in two ways. 

First, if $\sigma$ is a cardinal, a function $P : [I]^2 \to \sigma$ is called a partition of $[I]^2$
into $\sigma$ pieces. We may think of $P$ as coloring each of the edges of the
complete graph on $I$ in one of $\sigma$ possible colors. Graphs in the
previous sense may then be thought of as partitions of $[I]^2$ into two pieces; e.g.
in Fig. 4A, the edges $\{0, 1\}$, $\{1, 3\}$, $\{1, 2\}$ are colored black and the other
three edges are colored white (so they do not show).

Second, we may consider $[I]^n = \{F \subseteq I : |F| = n\}$ and partitions
$P : [I]^n \to \sigma$.

If $P : [I]^n \to \sigma$, we call $H \subseteq I$ homogeneous for $P$ iff $P$ is constant on
$[H]^n$. The Erdős notation $\kappa \to (\lambda)^n_\sigma$ is used to abbreviate the assertion:
Whenever $P : [\kappa]^n \to \sigma$, there is an $H \subseteq \kappa$ homogeneous for $P$ of cardinality
$\lambda$. So, e.g., $6 \to (3)^2_2$ but $5 \nrightarrow (3)^2_2$. Note that if $\kappa \to (\lambda)^n_\sigma$, $\kappa' \ge \kappa$, $\lambda' \le \lambda$,
$\sigma' \le \sigma$, and $n' \le n$, then $\kappa' \to (\lambda')^{n'}_{\sigma'}$.

The finite version of Ramsey's theorem, which we shall not prove, is that
for each $n, \sigma, \lambda \in \omega$, there is a $\kappa \in \omega$ such that $\kappa \to (\lambda)^n_\sigma$. The question of
how small a $\kappa$ will work is still a subject of much research (see ERDŐS,
HAJNAL and RADO [1965]). In the following, $\kappa$ and $\lambda$ will always be infinite
cardinals. Incidentally, ERDŐS and RADO [1952] showed that for all $\kappa$,
$\kappa \nrightarrow (\omega)^\omega_2$, so $n$ will always be finite (if one drops the axiom of choice,
relations like $\kappa \to (\lambda)^\omega_2$ may be consistent; see, e.g., KLEINBERG [1970]).
We shall show that for any $n, \sigma, \lambda$ there is a $\kappa$ such that $\kappa \to (\lambda)^n_\sigma$, and in
many cases produce a formula giving the least such $\kappa$. Our first result is


6.1. THEOREM (Ramsey). $\omega \to (\omega)^n_\sigma$ for any $n, \sigma \in \omega$.


The first non-trivial case of 6.1 is $\omega \to (\omega)^2_2$, the fact stated above for
graphs. We defer the proof of 6.1 until after we discuss the situation for
larger cardinals, since a number of related results can be proved by the
same method.

If one replaces $\omega$ by $\omega_1$ in 6.1, the theorem becomes false by either (a) or
(b) of the following (with $\kappa = \omega$).


6.2. THEOREM. For any $\kappa$,
(a) $2^\kappa \nrightarrow (3)^2_\kappa$,
(b) $2^\kappa \nrightarrow (\kappa^+)^2_2$.


PROOF. For (a), identify $2^\kappa$ with the set of functions from $\kappa$ into 2, and
define $P(\{f, g\})$ to be the least $\alpha$ such that $f(\alpha) \ne g(\alpha)$.
For (b), it is convenient to prove first:


6.3. LEMMA. If $\kappa \to (\lambda)^2_2$ and $(L, <)$ is a total ordering of cardinality $\kappa$, then
there is either an increasing or a decreasing $\lambda$-sequence in $L$.


PROOF. Let $\prec$ be any well-ordering of $L$, and define $P : [L]^2 \to 2$ by
$P(\{x, y\}) = 0$ iff $<$ and $\prec$ agree on $\{x, y\}$. Then a homogeneous set for $P$ of
size $\lambda$ gives the desired sequence.

To prove (b) from the lemma, it is sufficient to produce an order of
cardinality $2^\kappa$ with no increasing or decreasing $\kappa^+$-sequences. If $\kappa = \omega$, the
real numbers will do. In general, let $L$ be the set of functions from $\kappa$ into 2
and order $L$ lexicographically; that this has the desired property may be
proved exactly like 4.10. □
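The partition used for 6.2(a) has an exact finite analogue that can be tested directly: replace $\kappa$ by a finite number $k$, so that there are $2^k$ functions and $k$ colors. The sketch below is an illustration under that assumption; the names `first_difference` and `homogeneous_triples` are hypothetical.

```python
from itertools import combinations, product

def first_difference(f, g):
    """P({f, g}): the least coordinate at which the 0-1 sequences f and g
    differ. Assumes f != g and that both have the same length."""
    return next(i for i, (a, b) in enumerate(zip(f, g)) if a != b)

def homogeneous_triples(k):
    """All 3-element sets of functions k -> 2 on which P is constant."""
    points = list(product((0, 1), repeat=k))
    return [
        t for t in combinations(points, 3)
        if len({first_difference(f, g) for f, g in combinations(t, 2)}) == 1
    ]
```

The search finds no homogeneous triple, for the reason the infinite proof relies on: if $f$ first differs from both $g$ and $h$ at coordinate $\alpha$, then $g(\alpha) = h(\alpha)$, so $g$ and $h$ must first differ somewhere else.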


If one goes to $(2^\kappa)^+$, one gets positive results in 6.2, i.e., $(2^\kappa)^+ \to (\kappa^+)^2_\kappa$.
More generally, let $\exp_0(\kappa) = \kappa$; $\exp_{n+1}(\kappa) = 2^{\exp_n(\kappa)}$. Then:


6.4. THEOREM (ERDŐS and RADO [1956]). For every $\kappa$ and all $n \ge 1$,
$(\exp_{n-1}(\kappa))^+ \to (\kappa^+)^n_\kappa$.

We shall prove 6.4 later, but we state without proof the assertion that it is
the best possible. A proof may be extracted from ERDŐS, HAJNAL and
RADO [1965].


6.5. THEOREM. For all $n \ge 1$,
(a) $\exp_{n-1}(\kappa) \nrightarrow (\kappa^+)^n_2$,
(b) expn_(Kk) A (n + 2). 


By 6.4, for any $n, \sigma, \lambda$ there is a $\kappa$ such that $\kappa \to (\lambda)^n_\sigma$. For some cases
(e.g. when $\lambda$ is a successor and $\sigma < \lambda$), 6.4 and 6.5 give the best such $\kappa$. One
can establish the best $\kappa$ for most other cases (see ERDŐS, HAJNAL and RADO
[1965]), but the results seem too tedious to state here. A more interesting
question is whether $\kappa$ can be $\lambda$. The first non-trivial possibility, $\kappa \to (\kappa)^2_2$,
leads immediately to larger cardinals.


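6.6. THEOREM. If $\kappa \to (\kappa)^2_2$ and $\kappa > \omega$, then $\kappa$ is strongly inaccessible.


PROOF. By 6.2(b), if $\lambda < \kappa$, then $2^\lambda \nrightarrow (\lambda^+)^2_2$, whereas since $\lambda^+ \le \kappa$,
$\kappa \to (\lambda^+)^2_2$; thus, $2^\lambda < \kappa$. If $\kappa$ were not regular, write $\kappa = \bigcup_{\xi < \sigma} A_\xi$, where
the $A_\xi$ are disjoint, $\sigma < \kappa$, and each $|A_\xi| < \kappa$. Define $P : [\kappa]^2 \to 2$ so that
$P(\{x, y\}) = 0$ iff $x$ and $y$ are in the same $A_\xi$. Then $P$ could have no
homogeneous set of size $\kappa$. □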


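In fact, $\kappa \to (\kappa)^2_2$ implies that $\kappa$ is the $\kappa$-th strong inaccessible (see
Section 7).

There now comes to mind a whole spectrum of properties: $\kappa \to (\kappa)^2_3$,
$\kappa \to (\kappa)^3_2$, etc. Fortunately, these are all the same.


6.7. THEOREM. If $\kappa \to (\kappa)^2_2$, then $\kappa \to (\kappa)^n_\sigma$ for all $n < \omega$ and all $\sigma < \kappa$.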


We now owe the reader three proofs: 6.1, 6.4, and 6.7. We have been 
postponing them because they are all so similar they can just as well be 
done simultaneously. The similarity is in four respects. First, they are all 
proved by induction on n. Second, they are all trivial for n = 1. 

Third, they all use the concept of a pre-homogeneous (p.h.) set in the
induction step. If $P : [\theta]^{n+1} \to \sigma$ and $H \subseteq \theta$, we call $H$ p.h. for $P$ iff $P$ on
$[H]^{n+1}$ does not depend on the last element of an $(n+1)$-tuple, i.e.
$P(\{x_1, \ldots, x_n, y\}) = P(\{x_1, \ldots, x_n, z\})$ whenever $x_1 < x_2 < \cdots < x_n$, $x_n < y$,
$x_n < z$, and $x_1, \ldots, x_n, y, z \in H$. If $H$ is p.h. for $P$ and $H$ has no last
element, then we can define $Q : [H]^n \to \sigma$ by letting $Q(\{x_1, \ldots, x_n\}) =
P(\{x_1, \ldots, x_n, y\})$ for some (any) $y \in H$ larger than $x_1, \ldots, x_n$. If $K \subseteq H$ is
homogeneous for $Q$, then $K$ is homogeneous for $P$. Thus, the existence of a
large p.h. set, plus an inductive assumption about partitions of $n$-tuples, will
imply a partition relation for $(n+1)$-tuples. More precisely, Theorems 6.1,
6.4, and 6.7 follow by induction from parts (a), (b), and (c) respectively of


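6.8. LEMMA. Let $n \ge 1$.

(a) If $P : [\omega]^{n+1} \to \sigma$, where $\sigma < \omega$, then there is an infinite p.h. set for $P$.

(b) If $P : [\theta]^{n+1} \to \sigma$, and $\theta > \sigma^{<\lambda}$, then there is a p.h. set for $P$ of
cardinality $\lambda$.

(c) If $P : [\kappa]^{n+1} \to \sigma$, where $\sigma < \kappa$ and $\kappa \to (\kappa)^2_2$, then there is a p.h. set for
$P$ of cardinality $\kappa$.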


Note that in proving 6.4, (b) is used for $\theta = (\exp_n(\kappa))^+$, $\lambda = (\exp_{n-1}(\kappa))^+$,
and $\sigma = \kappa$; one assumes inductively that $(\exp_{n-1}(\kappa))^+ \to (\kappa^+)^n_\kappa$. (b) is useful
in deriving other partition relations not covered by our stated theorems;
for example, if $\kappa$ is strongly inaccessible, then $\kappa^+ \to (\kappa)^2_\sigma$ for any $\sigma < \kappa$
(take $\theta = \kappa^+$, $\lambda = \kappa$). It is probably better to remember the proof of 6.8 (to
follow), rather than a collection of seemingly unrelated arrow relations.
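The notion of a pre-homogeneous set, and the induced partition $Q$ of $n$-tuples, can be made concrete on finite sets of natural numbers. The sketch below is illustrative only. A finite $H$ always has a last element, so here $Q$ is defined just for those $n$-tuples that have some element of $H$ above them; that restriction, the two example colorings, and all function names are assumptions introduced for the illustration.

```python
from itertools import combinations

def is_prehomogeneous(H, P, n):
    """H is p.h. for P if P(x_1..x_n, y) does not depend on the choice of
    the last element y > x_n in H. P takes an increasing (n+1)-tuple."""
    H = sorted(H)
    for xs in combinations(H, n):
        later = [y for y in H if y > xs[-1]]
        if len({P(xs + (y,)) for y in later}) > 1:
            return False
    return True

def induced_Q(H, P, n):
    """The partition Q of n-tuples from a p.h. set H, as a dict, defined
    on those n-tuples that have at least one element of H above them."""
    H = sorted(H)
    Q = {}
    for xs in combinations(H, n):
        later = [y for y in H if y > xs[-1]]
        if later:
            Q[xs] = P(xs + (later[0],))
    return Q

def parity_of_first(t):
    """Example coloring of pairs that ignores the larger element."""
    return t[0] % 2

def parity_of_sum(t):
    """Example coloring of pairs that depends on both elements."""
    return (t[0] + t[1]) % 2
```

With `parity_of_first`, every set is p.h. and $Q(\{x\})$ is the parity of $x$, so a set of even numbers is homogeneous for $Q$ and hence for $P$, which is the reduction from $(n+1)$-tuples to $n$-tuples described above.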

The fourth similarity in our proofs is that the desired p.h. set will arise
from a path through a tree. 6.8(a) will use König's lemma (4.7) that there
are no $\omega$-Aronszajn trees. 6.8(b) will use a direct cardinality argument.
6.8(c) will use König's lemma at $\kappa$, namely:


6.9. LEMMA. If $\kappa \to (\kappa)^2_2$, there are no $\kappa$-Aronszajn trees.


PROOF. By Theorem 4.10, a $\kappa$-Aronszajn tree may be squashed to a total
ordering of size $\kappa$ with no increasing or decreasing $\kappa$-sequences. But then
by 6.3, $\kappa \nrightarrow (\kappa)^2_2$. □


We now prove Lemma 6.8 for the case $n = 1$. The general situation is:
we are given a partition $P : [\theta]^2 \to \sigma$ and we wish to produce a p.h. set of
size $\lambda$. We shall define a subtree $T$ of the complete $\sigma$-ary tree $\sigma^{<\lambda}$ and a
function $h : T \to \theta$ in such a way that if $f : \lambda \to \sigma$ is a path through $T$,
$\{h(f \upharpoonright \xi) : \xi < \lambda\}$ will be a p.h. set for $P$, enumerated in increasing order;
further, the $Q$ defined above 6.8 will be given by $f$ (i.e., $Q(\{h(f \upharpoonright \xi)\}) =
f(\xi)$). So $T$ is the tree of potential $Q$'s and $h$ assigns potential p.h. sets to
members of $T$.

More formally, define, for all $s \in \sigma^{<\lambda}$, $h(s) \in \theta$ and $A(s) \subseteq \theta$ so that:


(i) $A(\langle\,\rangle) = \theta$.

(ii) $h(s)$ is the least element of $A(s)$ if $A(s) \ne 0$; if $A(s) = 0$, say
$h(s) = 0$ (this will be irrelevant).

(iii) If $\mathrm{lh}(s)$ is a limit, $A(s) = \bigcap\{A(s \upharpoonright \xi) : \xi < \mathrm{lh}(s)\}$.

(iv) $A(s^\frown\mu) = A(s) \cap \{\gamma > h(s) : P(\{h(s), \gamma\}) = \mu\}$.

Let $T = \{s : A(s) \ne 0\}$. Now if $f : \lambda \to \sigma$ is such that $\forall \xi < \lambda\,(f \upharpoonright \xi \in T)$,
then for all $\xi < \eta < \lambda$, $h(f \upharpoonright \eta) \in A(f \upharpoonright (\xi + 1))$, so $P(\{h(f \upharpoonright \xi), h(f \upharpoonright \eta)\}) =
f(\xi)$; thus $\{h(f \upharpoonright \xi) : \xi < \lambda\}$ will be p.h. as desired. To see that there is such
an $f$, note that if $\gamma < \theta$ is not equal to $h(s)$ for any $s \in \sigma^{<\lambda}$, one may
define an $f$ inductively so that $\gamma \in A(f \upharpoonright \xi)$ for all $\xi$; this can be done
since the only element of $A(s)$ not in $\bigcup\{A(s^\frown\mu) : \mu < \sigma\}$ is $h(s)$. Thus, we
are done when the cardinal $\sigma^{<\lambda}$ is less than $\theta$, so 6.8(b) is proved (for
$n = 1$). In the case of 6.8(a) and (c), $\theta$ is inaccessible or $\omega$ and $\lambda = \theta$, so
$\sigma^{<\lambda} = \theta$, but $\sigma^{<\alpha} < \theta$ for all $\alpha < \theta$; our argument now only shows that
the tree has height $\theta$, since for $\alpha < \theta$, we may fix a $\gamma$ not equal to $h(s)$ for
any $s$ of length $< \alpha$, and produce an $f$ of length $\alpha$ with $\gamma \in A(f)$. Thus, if
there are no paths through $T$, $T$ would be a $\theta$-Aronszajn tree, so 6.8(a) and
(c) (for $n = 1$) follow from König's lemma and 6.9 respectively.
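Clauses (i), (ii) and (iv) of the construction can be followed step by step on a finite partition of pairs. The sketch below walks a single path through the tree, starting from $A(\langle\,\rangle) = \theta$, taking $h(s)$ to be the least element of $A(s)$, and passing to $A(s^\frown\mu)$. The rule for choosing $\mu$ (take the largest class) is our own finite stand-in for the cardinality and tree arguments of the text, and the names `prehomogeneous_path` and `example_coloring` are hypothetical.

```python
def prehomogeneous_path(theta, sigma, P):
    """Follow one path through the tree for P : [theta]^2 -> sigma.
    Returns (points, colors) with P(points[i], points[j]) == colors[i]
    for all i < j, so the points form a pre-homogeneous set."""
    A = list(range(theta))          # A(<>) = theta, kept in increasing order
    points, colors = [], []
    while A:
        h = A[0]                    # h(s): the least element of A(s)
        points.append(h)
        rest = A[1:]
        if not rest:
            break
        # A(s^mu) = A(s) intersected with {gamma > h(s): P({h(s), gamma}) = mu}
        classes = {mu: [g for g in rest if P(h, g) == mu] for mu in range(sigma)}
        mu = max(classes, key=lambda m: len(classes[m]))  # assumed choice rule
        colors.append(mu)
        A = classes[mu]
    return points, colors

def example_coloring(x, y):
    """An arbitrary 2-coloring of pairs, used only for illustration."""
    return bin(x ^ y).count("1") % 2

points, colors = prehomogeneous_path(64, 2, example_coloring)
```

Since each step keeps at least half of the remaining candidates, starting from 64 points with 2 colors yields at least 7 points on the path. This is the finite shadow of the requirement that $\theta$ exceed $\sigma^{<\lambda}$.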

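To prove 6.8 for arbitrary $n \ge 1$, we work not with $\sigma^{<\lambda}$, but with the
tree of potential partitions of $n$-tuples. Let, for $\xi < \lambda$, $S^n_\xi =
\{s : s : [\xi]^n \to \sigma\}$. Then $S^n_\xi$ is the $\xi$-th level of the tree $S^n = \bigcup\{S^n_\xi : \xi < \lambda\}$,
where $s \le t$ iff $s \subseteq t$. So, for $n = 1$, $S^n$ may be identified with $\sigma^{<\lambda}$. Given
$P : [\theta]^{n+1} \to \sigma$, define, for $s \in S^n$, $h(s) \in \theta$ and $A(s) \subseteq \theta$ so that

(i) $A(0) = \theta$.

(ii) $h(s)$ is the least element of $A(s)$ if $A(s) \ne 0$; if $A(s) = 0$ let
$h(s) = 0$.

(iii) If $s \in S^n_\eta$ for limit $\eta$, $A(s) = \bigcap\{A(s \upharpoonright [\xi]^n) : \xi < \eta\}$.

(iv) If $s \in S^n_\xi$, $t \in S^n_{\xi+1}$, and $s \subseteq t$,

$A(t) = A(s) \cap \{\gamma > h(s) : \forall F \in [\xi + 1]^n\,(P(h(F) \cup \{\gamma\}) = t(F))\}$.

In (iv), $h(F)$ means $\{h(t \upharpoonright [\eta_1]^n), h(t \upharpoonright [\eta_2]^n), \ldots, h(t \upharpoonright [\eta_n]^n)\}$, where
$F = \{\eta_1, \ldots, \eta_n\}$. With these definitions, the proof for arbitrary $n$ proceeds as
before. □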


We have thus proved 6.8, and hence Theorems 6.1, 6.4, and 6.7. A 
review of our argument for 6.7 shows that we have in fact established: 


6.10. THEOREM. Let $\kappa > \omega$. Then the following are equivalent:
(i) $\kappa \to (\kappa)^2_2$.
(ii) $\kappa \to (\kappa)^n_\sigma$ for all $n < \omega$ and all $\sigma < \kappa$.
(iii) $\kappa$ is strongly inaccessible and there are no $\kappa$-Aronszajn trees.
(iv) Whenever $(L, <)$ is a total ordering of size $\kappa$, there is either an
increasing or a decreasing $\kappa$-sequence in $L$.


PROOF. (ii) $\to$ (i) is trivial. (i) $\to$ (iv) is Lemma 6.3. The proof of 6.9
establishes (iv) $\to$ (iii) if we can see that (iv) implies that $\kappa$ is strongly
inaccessible. To see this, look back at the proof of 6.6. The proof that $\lambda < \kappa$
implies $2^\lambda < \kappa$ is unchanged. If $\kappa$ were not regular, we could, following the
notation of 6.6, define an order $\lhd$ on $\kappa$ by saying $x \lhd y$ iff either $x < y$ and
$x, y$ are in the same $A_\xi$, or $x \in A_\xi$, $y \in A_\eta$ where $\xi > \eta$. Then there could
be no increasing or decreasing $\kappa$-sequences in $(\kappa, \lhd)$. Finally, our proof of
6.7 establishes (iii) $\to$ (i). □


The text ERDŐS, HAJNAL, MÁTÉ and RADO [~1977] contains $\kappa$ more
theorems on the partition calculus.


7. Large cardinals 


Mathematicians and other children often play the following game: We 
take turns naming numbers, and see who can name the largest one. This is 
a game in the psychological rather than in the formal sense, since I might 
always just add one to your number, but my goal is to try to completely 
demolish your ego by transcending your number via some completely new 
principle. Thus, a sequence of plays might be: 


me: XVII, 
you: 1,295,387, 
me: 10°”, 
you: o, 


Mme: Wie,)> 


The next stages in the game, which are the subject of this section, go
through various inaccessibility properties. Here we must distinguish between
the weak and the strong properties. For example, $\kappa$ is a (weak) limit
cardinal iff $\lambda^+ < \kappa$ whenever $\lambda < \kappa$; it is a strong limit iff $2^\lambda < \kappa$ whenever
$\lambda < \kappa$. $\kappa$ is weakly inaccessible iff it is a regular limit cardinal $> \omega$, and
strongly inaccessible iff it is a regular strong limit cardinal $> \omega$. Under
GCH the weak and strong properties are equivalent, but it is consistent
that there are weak inaccessibles $< 2^\omega$.

After you have played the first inaccessible, it would not take much
imagination on my part to play the second, so instead I play the first
hyper-inaccessible. Here, $\kappa$ is (weakly/strongly) hyper-inaccessible iff $\kappa$ is
regular and a limit of (weak/strong) inaccessibles. The notions of
hyper-hyper-inaccessible, etc., are then obvious, so your honor demands that you
play the first Mahlo cardinal.

We call $\kappa$ weakly Mahlo iff $\kappa > \omega$, $\kappa$ is regular, and the regular cardinals
$< \kappa$ are stationary in $\kappa$. $\kappa$ is strongly Mahlo iff $\kappa$ is weakly Mahlo and
strongly inaccessible.


7.1. LEMMA. If $\kappa$ is (weakly/strongly) Mahlo, then $\{\alpha < \kappa : \alpha$ is
(weakly/strongly) inaccessible$\}$ is stationary in $\kappa$.


PROOF. Note that $\{\alpha < \kappa : \alpha$ is a (weak/strong) limit cardinal$\}$ is c.u.b. in $\kappa$,
and that the intersection of a c.u.b. set and a stationary set is
stationary. □


It is now easy to see that in 7.1 we may replace "inaccessible" by
"hyper-inaccessible" or "hyper-hyper-inaccessible."

There is now a notion of hyper-Mahlo obtained by demanding that the
set of Mahlo cardinals below $\kappa$ be stationary in $\kappa$, and there is the obvious
extension to hyper-hyper-Mahlo, etc., so I shall utterly annihilate you by
naming the first weakly compact cardinal.

$\kappa$ is called weakly compact iff $\kappa > \omega$ and $\kappa \to (\kappa)^2_2$. In 6.10, we proved
various combinatorial equivalences to this notion. It is also equivalent to
various statements in logic involving infinite formulas of length $< \kappa$ (see,
e.g., BARWISE [1975]).

It is not immediate that weak compactness is a large cardinal property, 
although we know (6.6) that it does imply strong (sic) inaccessibility 
(unfortunately, ‘‘strongly compact” has been used for another notion — 
see the Appendix or Drake [1974]). However: 


7.2. THEOREM. If $\kappa$ is weakly compact, then $\kappa$ is strongly
hyper-hyper-Mahlo.


It is convenient to first prove: 


7.3. THEOREM. If $\kappa$ is weakly compact and $S$ any stationary subset of $\kappa$, then
there is a regular $\lambda < \kappa$ such that $S \cap \lambda$ is stationary in $\lambda$.

To deduce 7.2 from 7.3, first apply 7.3 with $S$ any c.u.b. subset of $\kappa$ to
obtain a regular $\lambda \in S$; so $\kappa$ is strongly Mahlo. Now, let $S_1 = \{\lambda < \kappa : \lambda$ is
strongly inaccessible$\}$. If $C$ is any c.u.b. subset of $\kappa$, we may apply 7.3 with
$S = S_1 \cap C$ to get a regular $\lambda \in C$ with $S_1 \cap \lambda$ stationary in $\lambda$ (i.e. $\lambda$ is
strongly Mahlo), so the set of strong Mahlo cardinals is stationary in $\kappa$. We
may now replace $S_1$ by $S_2 = \{\lambda < \kappa : \lambda$ is strongly Mahlo$\}$ in the previous
sentence to obtain that $\kappa$ is strongly hyper-hyper-Mahlo.

To prove 7.3, we may first assume that $S$ contains only infinite cardinals.
Let $f : \kappa \to \kappa$ be the 1-1 increasing function that enumerates $S$. Let


$T = \{s \in \kappa^{<\kappa} : \forall \xi < \mathrm{lh}(s)\,[s(\xi) < f(\xi)]$ and $s$ is 1-1$\}$.


$T$ is a sub-tree of $\kappa^{<\kappa}$. If $T$ had height $\kappa$, $T$ would be a $\kappa$-Aronszajn tree,
since a path through $T$ would yield a 1-1 pressing-down function on $S$,
contradicting 2.3. However, by 6.10, there are no $\kappa$-Aronszajn trees, so 7.3
follows from:


7.4. LEMMA. Let $A$ be any set of infinite cardinals such that for all regular
$\lambda$, $A \cap \lambda$ is not stationary in $\lambda$. Then there is a 1-1 function $g$ on $A$ with
$\forall \alpha \in A\,[g(\alpha) < \alpha]$.


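PROOF. Let $\gamma = \sup(A)$ and assume 7.4 holds for all $A'$ with smaller sup.
Then we have three cases:

Case I. $\gamma$ is a successor cardinal. Trivial.

Case II. $\gamma$ is singular: Let $\theta = \mathrm{cf}(\gamma)$ and let $\delta_\mu$ ($\mu < \theta$) be a continuously
increasing cofinal sequence in $\gamma$, with each $\delta_\mu$ a cardinal and $\delta_0 > \theta$.
Let $g_\mu : A \cap \delta_\mu \to \delta_\mu$ satisfy 7.4. Define $g : A \to \gamma$ by

$$g(\alpha) = \begin{cases} g_0(\alpha) + 1 & \text{if } \alpha < \delta_0, \\ \delta_\mu + g_{\mu+1}(\alpha) & \text{if } \delta_\mu < \alpha < \delta_{\mu+1}, \\ \mu & \text{if } \alpha = \delta_\mu. \end{cases}$$

Case III. $\gamma$ is a regular limit cardinal. Proceed as in Case II (with
$\theta = \gamma$), but make sure that $\delta_\mu \notin A$ for all $\mu$.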

It might be suspected that the conclusion of 7.3 is equivalent to weak
compactness, but this turns out to be both consistent with (Jensen; see
Theorem 14.2 of Chapter B.5), and independent from (KUNEN [~1977])
ZFC + GCH (see Section 3 of Chapter B.3).

Weakly compact cardinals are the beginning, not the end, of large
cardinal theory. This game has been played to much greater heights (see
SILVER [1973], SHOENFIELD [1971], SOLOVAY, REINHARDT and KANAMORI
[~1977]). It should be pointed out that it is cheating to name a number
that does not exist. At times in the past, plausible large cardinal assumptions
have turned out to be inconsistent with ZFC (KUNEN [1971]). Weakly
compact cardinals are not known to be inconsistent with ZFC. But maybe
ZFC is inconsistent.


Appendix*. Ridiculously large cardinals 


Given an interesting property $P$ of cardinal numbers $\kappa$, we investigate
whether or not $P(\kappa)$ holds for some, most, or all cardinals $\kappa$. The results
discussed in this chapter all take this form, where $P$ is a combinatorial
property. Sometimes we discover that a particular property $P(\kappa)$, like
$\kappa \to (\kappa)^2_2$, is extremely rare, in that the least $\kappa$ such that $P(\kappa)$ holds is
incredibly large, so large in fact that we cannot even prove, from the usual
axioms of set theory, that there is a $\kappa$ such that $P(\kappa)$. In this case the
statement "there is a cardinal $\kappa$ such that $P(\kappa)$" is called a large cardinal
hypothesis, at least until it is refuted.

All of the large cardinal hypotheses discussed in Section 7 of this chapter 
can be given more or less convincing justifications along the lines discussed 
at the end of Chapter B.1 — convincing enough, at any rate, that almost no 
one expects them to be refuted. 

There are other, more problematic, large cardinal hypotheses. For these 
there is as yet no convincing justification. Far from being refuted, however, 
these hypotheses have led to some extremely beautiful mathematics. This 
work has been very well described in the literature, so we shall only
indicate the directions taken, and point to the appropriate items in the
references below.

Two large cardinal hypotheses which can be directly motivated as
strengthenings of weak compactness are the Ramsey cardinal and the
ineffable cardinal hypotheses. By $\kappa \to (\lambda)^{<\omega}_\sigma$ we mean that whenever we are
given $\langle P_n : n \in \omega \rangle$ with $P_n : [\kappa]^n \to \sigma$, there is an $H \subseteq \kappa$ of cardinality $\lambda$
which is homogeneous for each $P_n$. $\kappa$ is a Ramsey cardinal iff $\kappa \to (\kappa)^{<\omega}_2$;
this is equivalent to $\kappa \to (\kappa)^{<\omega}_\sigma$ for all $\sigma < \kappa$. By $\kappa \to (\mathrm{stat})^n_\sigma$ we mean that
whenever $P : [\kappa]^n \to \sigma$, there is a stationary homogeneous set for $P$. $\kappa$ is
ineffable iff $\kappa \to (\mathrm{stat})^2_2$; this is equivalent to $\kappa \to (\mathrm{stat})^2_\sigma$ for all $\sigma < \kappa$, but is
strictly weaker than $\kappa \to (\mathrm{stat})^3_2$ (see BAUMGARTNER [1975]).

Obviously, $\kappa$ being either ineffable or Ramsey implies that $\kappa$ is weakly
compact; but it also implies that the set of weakly compact cardinals below
$\kappa$ is a stationary subset of $\kappa$. The first Ramsey cardinal is not ineffable, but
the ineffable cardinals are stationary below any Ramsey cardinal.

* Appended by the editor and set-theory coordinator.

The existence of an ineffable cardinal does not refute Gödel's axiom of
constructibility ($V = L$), but the existence of a Ramsey cardinal implies that
$\mathcal{P}(\omega) \cap L$ is countable (see Section 6 of Chapter A.5, and SILVER [1973]).
It also implies that every $\Sigma^1_2$ set of reals is Lebesgue measurable, whereas
this statement is false in $L$ (see Chapter C.8).

A still larger cardinal property, measurability, was motivated historically
by the seemingly unrelated question of whether an ultrafilter can be
countably complete. $\kappa > \omega$ is called measurable iff there is a $\kappa$-complete
non-principal ultrafilter on $\kappa$. The existence of a measurable cardinal is
equivalent to the existence of a countably complete non-principal ultrafilter
on any set. If $\kappa$ is measurable, $\kappa$ is both Ramsey and ineffable, and the
set of cardinals with both these properties is stationary below $\kappa$.

In Chapter A.3, there is a discussion of ultrapowers. Many of the results
on measurable cardinals are proved by the method of Scott, which is to use
the ultrafilter on $\kappa$ to take an ultrapower of the whole universe of sets. The
exposition of this given in SHOENFIELD [1971] cannot be improved upon.

Still larger cardinal axioms are obtained by postulating ultrafilters of
specific types. For example, $\kappa$ is strongly compact iff every $\kappa$-complete
filter on any set can be extended to a $\kappa$-complete ultrafilter. SOLOVAY
[1974] has shown that if $\kappa$ is strongly compact, then the GCH holds at all
singular strong limit cardinals above $\kappa$.

It is known (LÉVY and SOLOVAY [1967]) that the existence of a large
cardinal cannot imply either CH or not CH. It is open whether cardinals
like strongly compact or above imply anything interesting about descriptive
set theory, e.g., measurability of $\Sigma^1_3$ sets of reals.

Introduction


COHEN [1966] proved that adding the negation $\neg$CH of the Continuum
Hypothesis (CH) to the axioms of set theory does not lead to contradiction,
unless the axioms themselves are already contradictory. In logicians'
jargon, $\neg$CH is consistent relative to the axioms of set theory. Since all
classical mathematics follows from these axioms, and there is little chance
of finding a contradiction in them, Cohen's work rules out a refutation of
$\neg$CH (a proof of CH, that is) by ordinary mathematical means. Since
GÖDEL [1940] had earlier ruled out a disproof of CH (this is discussed in
Chapter B.5 on constructibility), neither a positive nor a negative solution
to the continuum problem can be given in classical mathematics.

Cohen’s method, called forcing, has since been applied to prove (rela- 
tive) consistency for hypotheses in transfinite arithmetic, infinitary com- 
binatorics, general topology, measure theory, topology of the real line, 
universal algebra, and model theory. We hope to make the method 
accessible to readers innocent of logic and equipped with a minimum of set 
theory. (HALMOS [1960] suffices.) We will illustrate the method by proving
consistency for some principles used elsewhere in this volume. 

By “‘the’’ axioms of set theory we mean the Zermelo—Fraenkel axioms 
including Choice (ZFC). (See Chapter B.1. ZFC is essentially what Halmos 
uses.) Exact knowledge of the axioms is unnecessary. Trust that all classical 
mathematics following from them is necessary. The main definitions 
pertaining to cardinal and ordinal numbers are collected in an appendix to 
this section. 

Non-Euclidean geometry was proved free from contradiction by exhibit- 
ing models for it. Likewise consistency proofs by forcing involve models. 
The definition of a model of ZFC requires some preliminaries. 


0.1. Formal symbols for set theory 
We introduce some formal symbolism for writing about sets: 
    Variables for sets     $x, y, z, \ldots$
    Logical signs          $\neg$ (not), $\wedge$ (and), $\vee$ (or),
                           $\to$ (if..., then...), $\leftrightarrow$ (iff)
    Quantifiers            $\forall$ (for every), $\exists$ (there exists)
    Identity sign          $=$
    Membership sign        $\in$
plus parentheses for punctuation.

Examples. In this symbolism we express $y \subseteq x$ ($y$ is a subset of $x$) thus:

(i) $\forall z(z \in y \to z \in x)$.

We can then express $w = \mathcal{P}(x)$ (the power set or set of all subsets of $x$)
thus:

(ii) $\forall y(y \in w \leftrightarrow \forall z(z \in y \to z \in x))$.

Finally the assertion that for all $x$, $\mathcal{P}(x)$ exists, is expressed thus:

(iii) $\forall x \exists w \forall y(y \in w \leftrightarrow \forall z(z \in y \to z \in x))$.


This is one of the axioms of ZFC. 


0.2. DEFINITION (Syntax). A sequence of formal symbols which corresponds
to an English sentence, and not a mere phrase or jumble of words,
is a (well-formed) formula. Thus 0.1(i)-(iii) are formulas, while the following
are not:


$\forall z(z \in y \to$    and    $\vee \exists x) y =$.


An occurrence of a variable $x$ in a formula $\varphi$ is free if it is not part of a
clause of $\varphi$ beginning with $\forall x$ or $\exists x$. Thus in 0.1(i), $x, y$ are free. In 0.1(ii),
$x, w$ are free and $y, z$ are not. 0.1(iii) has no free variables. Such formulas
are called closed formulas, or sentences. (For more rigorous definitions, see
Sections 2 and 3 of Ch. A.1.) Section 1 will provide more examples of
translating English into formal symbolism. We hope readers will trust that
even something like CH can be expressed by a closed formula.


0.3. DEFINITION (Semantics). A set $a$ is transitive if whenever $b \in a$ and
$c \in b$ we have $c \in a$; equivalently, if every element of $a$ is a subset of $a$. If
$M$ is a transitive set, $\varphi$ a closed formula, we say $\varphi$ is true inside $M$, and
write $M \models \varphi$, if $\varphi$ comes out true when we read the quantifiers $\forall x$, $\exists x$ in $\varphi$
as meaning not literally "for all sets $x$" and "for some set $x$" but rather
"for all $x \in M$" and "for some $x \in M$". If $\varphi(x_1 \cdots x_n)$ has free variables
$x_1 \cdots x_n$ and $a_1 \cdots a_n \in M$, then $M \models \varphi(a_1 \cdots a_n)$ if $\varphi$ comes out true when
quantifiers are read as above and $x_1 \cdots x_n$ taken to stand for $a_1 \cdots a_n$
respectively.


Example. Let $\varphi(x, y)$ be the formula 0.1(i) expressing $y \subseteq x$. Then
$M \models \varphi(a, b)$ iff for every $c \in M$, $c \in b$ implies $c \in a$. Since by transitivity
$M$ contains all $c \in b$, this amounts to saying all $c \in b$ have $c \in a$, i.e. $b \subseteq a$.
In this case $M \models \varphi(a, b)$ iff $\varphi(a, b)$ is in fact true. Let $\psi(x, w)$ be
$\forall y(y \in w \leftrightarrow \varphi(x, y))$, which expresses that $w = \mathcal{P}(x)$. $M \models \psi(a, d)$ iff for
all $b \in M$ we have: $b \in d$ iff $M \models \varphi(a, b)$, i.e. iff $b \subseteq a$. Since all $b \in d$
belong to $M$, this means all $b \in d$ are subsets of $a$, and all subsets of $a$
which belong to $M$ are in $d$, i.e. $d = \mathcal{P}(a) \cap M$. Since not all subsets of $a$
need belong to $M$, we can have $M \models \psi(a, d)$ without $\psi(a, d)$ really being
true, i.e. $d = \mathcal{P}(a) \cap M \ne \mathcal{P}(a)$. The power set axiom $\forall x \exists w\, \psi(x, w)$ is
true inside $M$ if whenever $a \in M$, there is $d \in M$ with $M \models \psi(a, d)$;
equivalently, if $a \in M$ implies $\mathcal{P}(a) \cap M \in M$.
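The difference between the true power set and the power set as seen inside a transitive set can be observed on hereditarily finite sets. The sketch below is a toy illustration: the set `M` (the ordinal 3) is a small transitive set chosen for the example, not a model of ZFC, and the function names are hypothetical.

```python
from itertools import combinations

def is_transitive(M):
    """A set of sets is transitive if every element is also a subset."""
    return all(b <= M for b in M)

def powerset(a):
    """The true power set of a finite set a."""
    items = list(a)
    return frozenset(
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    )

def relative_powerset(a, M):
    """P(a) intersected with M: the subsets of a that belong to M."""
    return frozenset(b for b in M if b <= a)

zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
M = frozenset({zero, one, two})   # the ordinal 3, a transitive set
```

Here `relative_powerset(two, M)` omits the subset `{one}` of `two`, so it is strictly smaller than `powerset(two)`. It is also not an element of `M`, so the power set axiom is not true inside this particular `M`.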

A model of ZFC is a transitive set inside which all the axioms of ZFC are
true. A countable, transitive model of ZFC is called a CTM. We henceforth
assume CTM's exist. We show that this assumption implies the existence of
a CTM $M$ with $M \models \neg$CH. By any of several tricks known to logicians, a
proof of this implication can be turned into a proof that if there is no
contradiction in ZFC, then there is none in ZFC + $\neg$CH, the relative
consistency result we mentioned at the beginning of this section. [Hint:
apply the Compactness, Reflection, and Löwenheim-Skolem theorems, or
see COHEN [1966] Chapter IV, Section 11.]

It may seem (to quote SKOLEM [1922]) "a peculiar and apparently
paradoxical state of affairs" that ZFC, in which we can prove the existence
of enormous uncountable sets, could have a countable model. After all,
any element $a$ of a CTM $M$ will be a subset of $M$, and so like $M$ will be
countable. However (quoting Skolem again) "there is no contradiction at
all if a set $a$... is non-denumerable in the sense [of $M$]; for this only means
that within $M$ there is no one-to-one mapping $\Phi$ of $a$ onto... the [natural]
number sequence", which does not prevent such a $\Phi$ from existing
outside $M$.

It is natural to abuse language by speaking of an English sentence being
true inside a model when the corresponding formula is. In case we have
two "translations" $\varphi$, $\psi$ of the same English sentence, then if the translations
are at all reasonable, the equivalence $\varphi \leftrightarrow \psi$ should be a theorem,
and hence true inside any CTM $M$. Thus $M \models \varphi$ iff $M \models \psi$, and it does not
matter which formula we think of as "the" official translation. We already
used this abuse of language above in speaking of $a \in M$ being
non-denumerable in the sense of $M$.


Appendix. Some definitions 


See also Chapter B.1, and (for CUB sets and trees) Chapter B.3. 
Ordinals may be characterized either as those transitive sets all of whose
elements are transitive, or as those transitive sets whose elements are
linearly ordered by the $\in$-relation. We reserve lower case Greek letters
except $\varphi, \psi, \chi, \theta$ for ordinals. $\alpha < \beta$ iff $\alpha \in \beta$. Thus $\alpha = \{\beta : \beta < \alpha\}$. The
immediate successor $\alpha + 1$ of $\alpha$ is $\alpha \cup \{\alpha\}$. An ordinal not 0 or an
immediate successor of any other is called a limit. Lim($\alpha$) means that $\alpha$ is a
limit. The least limit is called $\omega$ and its elements are the finite ordinals. The
supremum of a set $X$ of ordinals, $\sup X$, is the least ordinal $\ge$ every
element of $X$. This is the same thing as $\bigcup X$, the set of all elements of
elements of $X$. A wellordering of a set $X$ is a linear ordering of $X$ in which
every nonempty subset of $X$ has a least element. The $\in$-relation provides
a natural wellordering on any ordinal $\alpha$. In fact any wellordering is
isomorphic to the natural ordering on some unique ordinal, the (order) type
of the wellordering.

The cardinality card $X$ of a set $X$ is the least ordinal $\kappa$ such that there
exists a bijection between $X$ and $\kappa$. A cardinal is an ordinal $\kappa$ with
card $\kappa = \kappa$. If $\kappa$ is a cardinal, $2^\kappa$ or $\exp(\kappa)$ is card $\mathcal{P}(\kappa)$. $\kappa^+$ is the least
cardinal $> \kappa$. $\omega_1$ is $\omega^+$, $\omega_2$ is $\omega_1^+$, etc. The Continuum Hypothesis (CH) is the
proposition $2^\omega = \omega_1$. The Generalized Continuum Hypothesis (GCH) is the
proposition that $2^\kappa = \kappa^+$ for all infinite cardinals $\kappa$. If $\alpha$ is a limit ordinal, its
cofinality cf $\alpha$ is the least $\lambda$ such that there is a function $f$ with domain $\lambda$
and sup range $f = \alpha$. If $\kappa$ is a cardinal, cf $\kappa$ is the least $\lambda$ such that $\kappa$ is the
union of $\lambda$ sets each of cardinality $< \kappa$. $\kappa$ is regular if cf $\kappa = \kappa$, otherwise $\kappa$
is singular. Cardinals of form $\kappa^+$ are successor cardinals, other cardinals are
limit cardinals. $\kappa$ is a strong limit if $2^\lambda < \kappa$ for all $\lambda < \kappa$. $\omega$ and all successor
cardinals are regular. A regular strong limit cardinal is said to be (strongly)
inaccessible.

Let $\alpha$ be a limit ordinal. $A \subseteq \alpha$ is unbounded in $\alpha$ if $\sup A = \alpha$. $A$ is
closed in $\alpha$ if whenever $\beta < \alpha$ and $\sup(A \cap \beta) = \beta$, then $\beta \in A$. $A$ is CUB
in $\alpha$ if $A$ is both closed and unbounded in $\alpha$. $A$ is stationary in $\alpha$ if
$A \cap C \ne \emptyset$ for every CUB $C \subseteq \alpha$. These notions trivialize when cf $\alpha = \omega$. If
cf $\alpha > \omega$, then an intersection of $<$ cf $\alpha$ CUB sets is CUB, the intersection
of a stationary and a CUB set is stationary, etc. (See Section 2 of
Chapter B.3.)

A relation $\le$ partially orders a set $X$ if $\le$ is reflexive, transitive, and
antisymmetric. A tree is a pair $T = (|T|, \le_T)$, where $\le_T$ partially orders
$|T|$, there is a $\le_T$-least element, and for any $a \in |T|$, the set of
$\le_T$-predecessors of $a$ is wellordered by $\le_T$. The type of this wellordering is
the rank of $a$. The $\alpha$-th level of $T$ is $T_\alpha = \{a : \text{rank } a = \alpha\}$. The height of $T$
is the least $\alpha$ with $T_\alpha = \emptyset$. A branch through $T$ is a $B \subseteq |T|$ such that any
two elements of $B$ are $\le_T$-comparable, and $B$ contains one element from
$T_\alpha$ for each $\alpha <$ height $T$. We usually identify $T$ and $|T|$. (See Section 4 of
Chapter B.3.)


1. A closer look at models of set theory 


We have seen that the formula φ(x, y) expressing that y ⊆ x enjoys the
following property: If M is a CTM and a, b ∈ M, then M ⊨ φ(a, b) iff
φ(a, b) is in fact true. Such a formula we call absolute. A nonabsolute
formula is the one ψ(x, y) expressing that y = 𝒫(x). We say an English
expression is absolute when the corresponding formula is.


1.1. LEMMA. The following are absolute:
(i) y = ∪x, the set of all elements of elements of x;
(ii) z = x ∩ y (or x ∪ y);
(iii) z = {x, y}, unordered pair;
(iv) z = (x, y), ordered pair;
(v) z = x × y, Cartesian product;
(vi) z is a function;
(vii) (z is a binary relation) ∧ x = dom z, the set of 1st components of
elements of z (or x = range z, the set of 2nd components, or x ⊆ dom z, etc.);
(viii) z is an injection (or surjection, or bijection) from x to y;
(ix) x is transitive;
(x) OR(x), x is an ordinal;
(xi) Lim(x), x is a limit ordinal (or x is a successor, x is a finite
ordinal, x is the ordinal ω);
(xii) Lim(x) ∧ y ⊆ x ∧ y is unbounded (or closed, or CUB) in x;
(xiii) y is a reflexive (or transitive, antisymmetric, etc.) binary relation on x.


PROOF. See Lemmas 1.3, 1.4 below. □


1.2. DEFINITION. We write ∀x ∈ y φ(x) and ∃x ∈ y φ(x) to abbreviate
∀x (x ∈ y → φ(x)) and ∃x (x ∈ y ∧ φ(x)). We write ∀x ∈∈ z φ(x) and
∃x ∈∈ z φ(x) to abbreviate ∀y ∈ z ∀x ∈ y φ(x) and ∃y ∈ z ∃x ∈ y φ(x).
We call these new items of notation bounded quantifiers. A formula is Δ_0
if it can be so written, using these abbreviations, as to contain only
bounded and no ordinary quantifiers.

The following result comes from Lévy [1965], which contains much
further information on absoluteness.


1.3. PROPOSITION. Δ_0-formulas are absolute.


PROOF. In general "for every set x" and "for every x ∈ M" are not
equivalent, so truth inside a CTM M and genuine truth can diverge. But if
y ∈ M, then "for every x ∈ y" and "for every x ∈ M such that x ∈ y"
amount to the same thing, since by transitivity all x ∈ y belong to M.
Since Δ_0-formulas contain only bounded quantifiers like "for every
x ∈ y", for these formulas truth inside M and genuine truth coincide,
i.e. they are absolute. □


1.4. LEMMA. The notions of Lemma 1.1 can be expressed by Δ_0-formulas.


PROOF. We exhibit the formulas (leaving items in parentheses in 1.1 to the
reader).

(i) ∀w ∈ y ∃z ∈ x (w ∈ z) ∧ ∀w ∈∈ x (w ∈ y),

(ii) ∀w ∈ z (w ∈ x ∧ w ∈ y) ∧ ∀w ∈ x (w ∈ y → w ∈ z),

(iii) x ∈ z ∧ y ∈ z ∧ ∀w ∈ z (w = x ∨ w = y),

(iv) ∃z_0 ∈ z ∃z_1 ∈ z (z = {z_0, z_1} ∧ z_0 = {x, x} ∧ z_1 = {x, y}),

(v) ∀w ∈ z ∃x' ∈ x ∃y' ∈ y (w = (x', y'))
    ∧ ∀x' ∈ x ∀y' ∈ y ∃w ∈ z (w = (x', y')),

(vi) ∀w ∈ z ∃x ∈∈ w ∃y ∈∈ w (w = (x, y))
    ∧ ∀w_0 ∈ z ∀w_1 ∈ z ∀x_0 ∈∈ w_0 ∀x_1 ∈∈ w_1 ∀y_0 ∈∈ w_0 ∀y_1 ∈∈ w_1
      (w_0 = (x_0, y_0) ∧ w_1 = (x_1, y_1) ∧ x_0 = x_1 → y_0 = y_1),

(vii) ∀w ∈ z ∃x' ∈ x ∃y ∈∈ w (w = (x', y))
    ∧ ∀x' ∈ x ∃w ∈ z ∃y ∈∈ w (w = (x', y)),

(viii) (z is a function) ∧ x = dom z ∧ range z ⊆ y
    ∧ ∀w_0 ∈ z ∀w_1 ∈ z ∀x_0 ∈ x ∀x_1 ∈ x ∀y_0 ∈ y ∀y_1 ∈ y
      (w_0 = (x_0, y_0) ∧ w_1 = (x_1, y_1) ∧ y_0 = y_1 → x_0 = x_1),

(ix) ∀y ∈ x ∀z ∈ y (z ∈ x),

(x) (x is transitive) ∧ ∀y ∈ x (y is transitive),

(xi) OR(x) ∧ ∀z ∈ x ∃y ∈ x (z ∈ y),

(xii) Lim(x) ∧ y ⊆ x ∧ ∀z ∈ x ∃w ∈ y (z ∈ w),

(xiii) (y is a binary relation) ∧ x = dom y ∧ x = range y
    ∧ ∀z ∈ x ∃w ∈ y (w = (z, z)).

It is understood that in (iv) "z = {z_0, z_1}" is to be written out as a
Δ_0-formula using (iii). □
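
To make the role of bounded quantifiers concrete, here is a minimal Python sketch, offered only as an illustration and not as part of the proof. It models hereditarily finite sets as nested frozensets (an assumption of the sketch) and evaluates a few of the formulas above. Every quantifier ranges only over elements, or elements of elements, of the arguments, which is exactly why such checks never need to look outside a transitive model. The function names are hypothetical.

```python
def pair(x, y):
    """Unordered pair {x, y} as a frozenset."""
    return frozenset([x, y])

def opair(x, y):
    """Ordered pair (x, y) = {{x, x}, {x, y}}, following 1.4(iv)."""
    return pair(pair(x, x), pair(x, y))

def is_union_of(y, x):
    """1.4(i): every w in y lies in some z in x, and every element
    of an element of x lies in y."""
    return (all(any(w in z for z in x) for w in y)
            and all(w in y for z in x for w in z))

def is_transitive(x):
    """1.4(ix): for all y in x and all z in y, z is in x."""
    return all(z in x for y in x for z in y)

def is_ordinal(x):
    """1.4(x): x is transitive and every element of x is transitive."""
    return is_transitive(x) and all(is_transitive(y) for y in x)

def von_neumann(n):
    """The finite ordinal n = {0, 1, ..., n-1}."""
    a = frozenset()
    for _ in range(n):
        a = a | frozenset([a])
    return a
```

For instance, `is_ordinal(von_neumann(4))` holds, while the set {1} is not transitive and so fails the test.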


1.5. Consequences of absoluteness 

(i) We do not constantly have to be saying such things as, "Let f ∈ M be
such that M ⊨ f is a function". It is enough to say, "Let f ∈ M be a
function". By absoluteness of being a function, inside or outside M makes
no difference.

(ii) Absoluteness guarantees that CTM's are closed under many
set-theoretic operations. If M is a CTM, a ∈ M a binary relation, then it
is a theorem (of naive set theory, and of ZFC) that we can form the set
range a of 2nd components of elements of a. This being true inside M,
there must be b ∈ M such that M ⊨ (b = range a). By absoluteness of
ranges, this b can only be the genuine range of a, so range a ∈ M, and M
is closed under taking ranges.

Similarly, M is closed under forming pairs, under ∩ and ×, etc. Also ω
and all finite ordinals belong to M. From 1.6 below it will follow M is
closed under the operation 𝒫_ω(a) = {b ⊆ a: b is finite}.

(iii) Zermelo's Separation Axiom, an important axiom of ZFC, says that
for any set a and formula φ(x) we can form {b ∈ a: φ(b)}. Applied to a
CTM M this means that for a ∈ M we can form {b ∈ a: M ⊨ φ(b)} inside
M. In case φ is absolute, this equals {b ∈ a: φ(b)}. Thus we can
"separate" the ordinals, or functions, etc. from a ∈ M, forming
{b ∈ a: OR(b)}, {f ∈ a: f is a function} in M.

(iv) We can combine closure properties of a CTM M to get new ones.
Thus if a_1 ⋯ a_n ∈ M, then by closure under singletons and union, we see
{a_1 ⋯ a_n} ∈ M. Thus M is closed under forming finite subsets.

Combining closure under ∩, ×, range, and 𝒫_ω, and separating ordinals
and functions, we see that the following belong to M if a, b do:

{c ∈ range a: ∃d ∈ b ((d, c) ∈ a)} = range(a ∩ (b × range a)),
{b ∈ range a: OR(b)},
{f: (f is a function) ∧ (dom f is finite) ∧ dom f ⊆ a ∧ range f ⊆ b}
    = {f ∈ 𝒫_ω(a × b): f is a function}.

Below when working with a CTM M we frequently assert that various
sets belong to M. These claims can be justified by the method just
illustrated (though sometimes it is necessary to check that some notion not
on the list 1.1 is absolute). On first reading it may be well just to trust
us on these claims so as not to lose the thread of our arguments.

(v) By absoluteness, for any CTM M and any a ∈ M, we have
M ⊨ OR(a) iff a is in fact an ordinal. Let OR^M be the least ordinal
α ∉ M. Then if β ∈ M, γ < β (i.e. γ ∈ β), then γ ∈ M. Thus
{β ∈ M: OR(β)} = {β: β < OR^M} = OR^M.

(vi) Some nonabsolute notions: For a CTM M and a ∈ M, write 𝒫^M(a)
for the power set of a in the sense of M, i.e. the b ∈ M such that
M ⊨ (b = 𝒫(a)). Using the absoluteness of the subset relation, we saw in
0.3 above that 𝒫^M(a) = 𝒫(a) ∩ M. Similarly, write card^M a for the
b ∈ M such that M ⊨ (b = card a). The absoluteness of being an ordinal or
a bijection then tells us that card^M a is the least ordinal α < OR^M such
that there is a bijection between a and α which belongs to M. If
a ∈ 𝒫^M(α) for some α < OR^M, then the absoluteness of CUB shows that
M ⊨ (a is stationary in α) iff a ∩ C ≠ ∅ for every CUB C ⊆ α which
belongs to M. Power set, cardinality, stationary subset are not themselves
absolute notions, but absoluteness of some related notions has enabled us
to see what these notions do amount to inside a CTM.

Intuitively a property is absolute if we can check whether it holds for
elements of a CTM M without leaving M. Thus to check for a, b, c ∈ M
whether c = {a, b}, we need to check that a, b ∈ c and that c has no
elements other than a, b. Since all elements of c belong to M, we do not
need to leave M to do this. By contrast, to check for α < OR^M and a ⊆ α
whether a is a stationary subset of α requires checking whether
a ∩ C ≠ ∅ for all CUB C ⊆ α. Since not all such C belong to M, we are
unable to check this inside M.


1.6. LEMMA. The following are absolute:
(i) x is finite,
(ii) y = 𝒫_ω(x),
(iii) y is a wellordering of x,
(iv) (y is a wellordering of x) ∧ OR(z) ∧ (z is the order type of y).


PROOF. It is known that these items cannot be expressed by Δ_0-formulas,
so we must use instead our intuitive understanding of the meaning of
absoluteness. We give the proof for (i), leaving (ii) to the reader. (iii)
requires more difficult arguments and its proof will be omitted. (See LÉVY
[1965].) (iv) follows using (iii) and the absoluteness of being an
isomorphism between orderings (a slight extension of the absoluteness of
being a bijection).

Let M be a CTM, a ∈ M. To check whether a is finite, we need to know
whether there is a bijection from a finite ordinal onto a. Being a finite
ordinal and being a bijection are absolute. Moreover all finite ordinals
belong to M. Thus to prove absoluteness it suffices to show that if there
does exist a bijection f between a and a finite ordinal n, then this f
belongs to M. But f is a finite set of ordered pairs of elements of M, and
M is closed under pairing and finite subsets, so f ∈ M as required. □


Armed with the information on CTM’s provided by this section, we can 
turn to a study of the forcing method for constructing CTM’s with special 
properties. 


2. Forcing 


The methods of COHEN [1966] have been enormously streamlined and
generalized through the work of many set-theorists (not including the
present writer). Since many were involved, we attach no individual names
to the basic lemmas in this section and the next. We do credit particular
consistency results obtained by the forcing method to their authors. Our
account of forcing derives from SHOENFIELD [1971] and from lectures of
Solovay, among the foremost contributors to the theory of forcing.


2.1. DEFINITION. A PO set is a pair P = (|P|, ≤_P), where

(i) ≤_P partially orders |P| (i.e. is a reflexive, transitive,
antisymmetric relation on |P|),

(ii) there exists a ≤_P-greatest element 1_P,

(iii) there exist no ≤_P-minimal elements.

We usually identify the structure P and the underlying set |P|, and drop
subscripts from ≤ and 1. If P is a PO set and p, q ∈ P, we say:

p < q iff p ≤ q and not q ≤ p,

p, q are comparable iff p ≤ q or q ≤ p,

p, q are compatible iff there exists r with r ≤ p and r ≤ q,

p, q are incompatible iff no such r exists.

If P is a PO set and A ⊆ P, we say:
A is open iff whenever p ∈ A and q ≤ p, then q ∈ A,
A is dense iff for every p ∈ P there is q ∈ A with q ≤ p,
A is an antichain iff any two distinct elements of A are
incompatible.
Note that an antichain A ⊆ P is maximal iff every p ∈ P is compatible
with some q ∈ A. We say A ⊆ P is dense below p ∈ P iff for every p' ≤ p
there is q ∈ A with q ≤ p'. Antichains below p are similarly defined.

If P, Q are PO sets, we say Q is a suborder of P if |Q| ⊆ |P|, 1_P ∈ |Q|
and ≤_Q is just the restriction of ≤_P to |Q|. If Lim(λ) and P_α, α < λ,
are PO sets, we say the sequence of P_α's is increasing if P_α is a
suborder of P_β for α < β. The union of such an increasing sequence is the
PO set P, whose underlying set is the union of the underlying sets of the
P_α's and whose (partial) order relation is the union of the order
relations of the P_α's. These notions will become important in Section 5.


2.2. DEFINITION. Let P be a PO set, F any family of sets, G ⊆ P. Then G is
F-generic iff the following hold:

(i) Whenever p ∈ G and p ≤ q, then q ∈ G.

(ii) Whenever p, q ∈ G, then there exists r ∈ G with r ≤ p and r ≤ q. In
particular, any two elements of G are compatible.

(iii) G ∩ D ≠ ∅ for any dense D ⊆ P with D ∈ F.


2.3. PROPOSITION. Let P be a PO set, F a family of sets containing only
countably many dense subsets of P, and p ∈ P. Then there exists an
F-generic G ⊆ P with p ∈ G.


PROOF. Enumerate the dense subsets of P in F as {D_n: n ∈ ω}. We pick
elements p_n of P as follows: Let p_0 = p. If p_n is defined, pick
p_{n+1} ≤ p_n with p_{n+1} ∈ D_n. This is possible since D_n is dense.
Let G = {q ∈ P: ∃n ∈ ω (p_n ≤ q)}. Clearly p ∈ G and 2.2(i), (iii) hold.
Now (ii) also holds. For if q, r ∈ G, say p_m ≤ q, p_n ≤ r, then p_k is
≤ both q and r, where k = max(m, n). □
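
The construction in this proof can be run for finitely many steps on a toy example. The sketch below is only an illustration under stated assumptions: conditions are finite partial functions from natural numbers to {0, 1}, stored as Python dicts and ordered by extension, and the dense sets are taken to be D_n = {q: n ∈ dom q}. Neither choice comes from the proposition itself, and all function names are hypothetical.

```python
def leq(p, q):
    """p <= q: p extends q as a function (the stronger condition is smaller)."""
    return all(k in p and p[k] == v for k, v in q.items())

def extend_into(p, n):
    """Return some p' <= p lying in the assumed dense set D_n = {q : n in dom q}."""
    q = dict(p)
    q.setdefault(n, 0)   # if p already decides n, then p itself is in D_n
    return q

def descending_chain(p, steps):
    """p_0 = p, and p_{n+1} <= p_n with p_{n+1} in D_n for each n < steps."""
    chain = [dict(p)]
    for n in range(steps):
        chain.append(extend_into(chain[-1], n))
    return chain

def in_G(q, chain):
    """q lies in the upward closure of the chain: some p_n <= q."""
    return any(leq(p_n, q) for p_n in chain)
```

A finite run only approximates G, of course: the genuine generic set needs the whole ω-sequence of conditions.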


Let M be a CTM, P ∈ M a PO set. Taking F = M in 2.3, we see for any
p ∈ P there exists an M-generic G ⊆ P with p ∈ G. For most P it can be
shown that no such G belongs to M.


2.4. DEFINITION. Let P be a PO set. Any subset of a set of form P × a is
called a P-term. If t ⊆ P × a is a P-term and G ⊆ P, the G-interpretation
of t, I_G(t), is defined to be {b ∈ a: ∃p ∈ G ((p, b) ∈ t)}.
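
As a small illustration of the definition, the following Python sketch computes I_G(t) for finite data. It assumes conditions and elements are hashable Python values and that a term is a set of (condition, element) pairs; the names `interpret`, `star` and `canonical_term` are hypothetical. The two helper terms are a* = P × a and {(p, p): p ∈ P}.

```python
def interpret(t, G):
    """I_G(t) = {b : (p, b) in t for some p in G}."""
    return {b for (p, b) in t if p in G}

def star(P, a):
    """The term a* = P x a."""
    return {(p, b) for p in P for b in a}

def canonical_term(P):
    """The term {(p, p) : p in P}; its interpretation under G is G."""
    return {(p, p) for p in P}
```

With P = {"1", "p", "q"} and G = {"1", "p"}, `interpret(star(P, {0, 1, 2}), G)` returns {0, 1, 2}, and interpreting `canonical_term(P)` returns G.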


Examples. For any a, let a* = P × a. Then for any nonempty G,
I_G(a*) = a. Let Ġ = {(p, p): p ∈ P}. Then for any G, I_G(Ġ) = G.

For proofs of the following lemma, and of Lemmas 2.9 and 2.11, we refer
the reader to SHOENFIELD [1971]. It is hardly surprising that in a short
exposition we should have to assume some results without proof. What is
rather surprising is that these three lemmas imply everything we will need
to know about forcing.


2.5. EXISTENCE AND MINIMALITY LEMMA. Let M be a CTM, P ∈ M a PO set,
and G ⊆ P M-generic. Then there exists a smallest CTM N with the
properties that M ⊆ N and G ∈ N. Moreover, if a ⊆ M and a ∈ N, then
a = I_G(t) for some P-term t ∈ M.


We write N= M[G] and call N the Cohen extension obtained by 
adjoining G to the ground model M. 

(Notice that since G ∉ M, N is a proper extension of M. Notice also that
by remarks in 1.5(iv), any CTM N with t, G ∈ N will have I_G(t) ∈ N. Thus
a ⊆ M is in M[G] iff a has form I_G(t) for some t ∈ M.)


2.6. PROPOSITION. Let M be a CTM, P ∈ M a PO set, G ⊆ P M-generic,
N = M[G]. Then OR^M = OR^N.


PROOF. Recall OR^M = {α ∈ M: OR(α)} = the least ordinal α ∉ M. So it
suffices to show OR^M ∉ N. Suppose the contrary. Then OR^M = I_G(t) for
some t ∈ M. By our remarks in 1.5(iv), whenever a ∈ M, {α ∈ range a:
OR(α)} ∈ M. Hence OR^M = {α ∈ range t: OR(α)} ∈ M, a contradiction. □


2.7. DEFINITION. Let M be a CTM, P ∈ M a PO set, p ∈ P, t_1 ⋯ t_n ∈ M
P-terms, and φ(x_1 ⋯ x_n) a formula with free variables x_1 ⋯ x_n. We say
p P-forces φ(t_1 ⋯ t_n) over M, and write


p ⊩_M^P φ(t_1 ⋯ t_n)


if M[G] ⊨ φ(I_G(t_1) ⋯ I_G(t_n)) for every M-generic G ⊆ P with p ∈ G.
We generally omit the sub- and superscripts on ⊩.


Example. If p ≤ q, and q*, Ġ are the terms defined in 2.4 above, then
p ⊩ q* ∈ Ġ. For if G ⊆ P is M-generic and p ∈ G, then q = I_G(q*) ∈ G =
I_G(Ġ). If instead p, q are incompatible, then p ⊩ q* ∉ Ġ, since p ∈ G
implies q ∉ G for generic G.


We speak of an English expression being forced when the corresponding
formula is. Intuitively, something is forced by p if we can tell it is
going to be true in M[G] just by knowing that p ∈ G. Since an element p of
P thus restricts the possibilities as to what may happen in M[G], the
elements of P are often called conditions. When p ≤ q, p is more
restrictive than q (p ∈ G implies q ∈ G for generic G), so p is said to be
a stronger condition.


2.8. PROPOSITION. Let M be a CTM, P ∈ M a PO set, p ∈ P, a ∈ M, φ, etc.
formulas. Then the following hold:
(i) If p ⊩ φ_1, ⋯, p ⊩ φ_n, and if φ_1 ⋯ φ_n together with the axioms of
ZFC logically imply φ, then p ⊩ φ.
(ii) Not both p ⊩ φ and p ⊩ ¬φ.
(iii) If p ⊩ φ and q ≤ p, then q ⊩ φ.
(iv) If p ⊩ φ and q ⊩ ¬φ, then p, q are incompatible.
(v) p ⊩ φ ∧ ψ iff both p ⊩ φ and p ⊩ ψ.
(vi) p ⊩ ∀x ∈ a* θ(x) iff for all b ∈ a, p ⊩ θ(b*).
(vii) p ⊩ ∀x (x ⊆ a* → θ(x)) iff for all t ∈ 𝒫^M(a*), p ⊩ θ(t).


PROOF. These are pleasant exercises in the definitions, and readers may
wish to try them before reading the hints that follow:

(i) For any M-generic G ⊆ P with p ∈ G, φ_1 ⋯ φ_n and all the axioms of
ZFC are true inside M[G], hence so is φ.

(ii) By 2.3 there exists a generic G with p ∈ G. Not both M[G] ⊨ φ and
M[G] ⊨ ¬φ.

(iii) When q ≤ p, any generic G with q ∈ G also has p ∈ G.

(iv) Follows from (ii) and (iii).

(v) The RHS of (v) means that first for any generic G with p ∈ G,
M[G] ⊨ φ, and second that for any such G, M[G] ⊨ ψ. The LHS means that
for any such G, M[G] ⊨ φ ∧ ψ, which is the same thing!

(vi) For any generic G, any element of I_G(a*) = a has form I_G(b*) = b
for some b ∈ a. This noted, (vi) becomes, like (v), a tautology.

(vii) For any generic G, any subset of a in M[G] has form I_G(t) for
some t ∈ M. It is no real restriction to consider only t ⊆ a*, since in
general I_G(t ∩ a*) = I_G(t) ∩ a. This noted, (vii) is a tautology. □


In order to determine when p ⊩ ¬φ, we need a deeper result.
(See SHOENFIELD [1971].) Note the "if" part of the following is part of
Definition 2.7.


2.9. TRUTH LEMMA. Let M be a CTM, P ∈ M a PO set, G ⊆ P M-generic,
t_1 ⋯ t_n ∈ M P-terms, and φ(x_1 ⋯ x_n) a formula. Then
M[G] ⊨ φ(I_G(t_1) ⋯ I_G(t_n)) iff there is a p ∈ G such that
p ⊩ φ(t_1 ⋯ t_n).


2.10. PROPOSITION. Let the notation be as in 2.8. Then:
(i) p ⊩ ¬φ iff no q ≤ p has q ⊩ φ.

(ii) If {r ∈ P: r ⊩ φ} is dense below p in P, then p ⊩ φ.

(iii) p ⊩ φ ∨ ψ iff for every q ≤ p there exists r ≤ q such that either
r ⊩ φ or r ⊩ ψ.

(iv) p ⊩ ∃x ∈ a* θ(x) iff for every q ≤ p there exist r ≤ q and b ∈ a
such that r ⊩ θ(b*).

(v) p ⊩ ∃x (x ⊆ a* ∧ θ(x)) iff for every q ≤ p there exist r ≤ q and
t ∈ 𝒫^M(a*) such that r ⊩ θ(t).


PROOF. (i) The "only if" part follows from 2.8(iv). For the "if" part,
suppose we do not have p ⊩ ¬φ. Then there is a generic G with p ∈ G
and M[G] ⊨ φ. By the Truth Lemma, there is q ∈ G with q ⊩ φ. We may
suppose q ≤ p. If not, there is q' ∈ G with q' ≤ p, q' ≤ q. By 2.8(iii)
this q' still forces φ. Thus if not p ⊩ ¬φ, there is q ≤ p with q ⊩ φ.
Contraposing gives (i).

(ii) If {r ∈ P: r ⊩ φ} is dense below p, then by 2.8(iv), no q ≤ p can
force ¬φ. By (i), p ⊩ ¬¬φ, and by 2.8(i), p ⊩ φ.

(iii), (iv), (v) follow from (i), 2.8(v), (vi), (vii), and the equivalence
of φ ∨ ψ to ¬(¬φ ∧ ¬ψ) and ∃x θ(x) to ¬∀x ¬θ(x). □


2.11. DEFINABILITY LEMMA. To every formula φ(x_1 ⋯ x_n) we can associate
a formula φ*(x_1 ⋯ x_n, y, z) such that for any CTM M, any PO set P ∈ M,
any p ∈ P, and any P-terms t_1 ⋯ t_n ∈ M we have:


p ⊩ φ(t_1 ⋯ t_n) iff M ⊨ φ*(t_1 ⋯ t_n, p, P).


For a proof we refer the reader again to SHOENFIELD [1971]. Note that
our definition of forcing (2.7) does not directly provide us with suitable
formulas φ*. The definition of forcing in terms of M-generic sets is
something we cannot talk about inside M; inside M there just aren't any
M-generic sets. In fact φ* must be obtained quite deviously and is quite
complicated even for simple φ like (x_1 ∈ x_2).

An important consequence of 2.11 is the following closure property: If M
is a CTM, P ∈ M a PO set, T_1 ⋯ T_n ∈ M sets of P-terms, φ(x_1 ⋯ x_n) a
formula, and


E = {(p, t_1 ⋯ t_n) ∈ P × T_1 × ⋯ × T_n: p ⊩ φ(t_1 ⋯ t_n)}


then E ∈ M. This follows from Zermelo's Separation Axiom (cf. 1.5(iii))
since E = {(p, t_1 ⋯ t_n): M ⊨ φ*(t_1 ⋯ t_n, p, P)}.
By abuse of language we will write, e.g. M ⊨ (p ⊩ t is a cardinal) when
we mean M ⊨ φ*(t, p, P), where φ(x) expresses that x is a cardinal; in
other words, when we mean that p ⊩ t is a cardinal. All such talk of
forcing inside a CTM M involves implicit appeal to the Definability
Lemma.

It will be useful to know that the notions introduced in Definitions 2.1 
and 2.2 are absolute. 


2.12. LEMMA. The following are absolute:

(i) x is a PO set.

(ii) (x is a PO set) ∧ y, z ∈ |x| ∧ y ≤_x z (or y <_x z; or y, z are
comparable, or compatible, or incompatible).

(iii) (x is a PO set) ∧ y ⊆ |x| ∧ y is open (or dense, or an antichain,
or a maximal antichain, or dense below x' ∈ |x|, etc.).

(iv) (x is a PO set) ∧ (y is a family of sets) ∧ z ⊆ |x| ∧ z is y-generic.

(v) (x is a PO set) ∧ y ⊆ |x| ∧ z ∈ |x| ∧ ∃z' ∈ y (z ≤_x z').


PROOF. We can express each of these by a Δ_0-formula. E.g. (i) becomes:

∃x_0 ∈∈ x ∃x_1 ∈∈ x (x = (x_0, x_1)
  ∧ (x_1 is a reflexive, transitive, antisymmetric binary relation on x_0)
  ∧ ∃y ∈ x_0 ∀y' ∈ x_0 ∃z ∈ x_1 (z = (y', y))
  ∧ ∀y ∈ x_0 ∃y' ∈ x_0 (∃z ∈ x_1 (z = (y', y))
      ∧ ¬∃z ∈ x_1 (z = (y, y')))).


Here reflexivity, etc. are to be expressed as Δ_0-formulas by Lemma 1.4.
Alternatively, we could work with our intuitive understanding of
absoluteness and show that for a CTM M and P ∈ M, we can check whether P
is a PO set without leaving M. This is in fact not hard to see, since
having all the elements of P available to us inside M, we can easily check
for the existence of a maximum and the nonexistence of minimal elements.

We leave verification of absoluteness for (ii)-(v) to the reader. □


Our remarks 1.5 on consequences of absoluteness apply also to these
notions. Thus we don't have to say, "Let P ∈ M be such that M ⊨ (P is a
PO set)". We can just say, "Let P ∈ M be a PO set". Thus also given a
CTM M, a PO set P ∈ M, and A ∈ 𝒫^M(P), we can form D =
{q ∈ P: ∃p ∈ A (q ≤ p)} inside M, since this just involves "separating"
the elements of P satisfying a certain condition which 2.12(v) tells us is
absolute. Below we will frequently claim that certain sets whose
definitions involve notions pertaining to PO sets can be formed inside a
CTM. On first reading it may be best just to trust us, but these claims
can all be justified by showing some formula to be absolute. We give one
example in the next lemma.


2.13. LEMMA. Let M be a CTM, P ∈ M a PO set, p ∈ P, G ⊆ P an
M-generic set with p ∈ G. Then G ∩ D ≠ ∅ for any D ∈ 𝒫^M(P) which is
dense below p.


PROOF. Let E = {q ∈ P: q ∈ D ∨ q, p are incompatible}. This set can be
formed inside M since its definition is absolute (cf. 2.12(ii)). Any q ∈ P
which is not incompatible with p has an r ≤ q with r ≤ p, and hence by
density of D below p, an s ≤ r ≤ q with s ∈ D. This shows E is fully dense
in P, so by M-genericity, G ∩ E ≠ ∅. But since p ∈ G, no element of G is
incompatible with p, so G ∩ D ≠ ∅ as required. □


The following strengthening of 2.10(v) will not be used until Section 6, 
and can be skipped on first reading. Its proof is a tour de force. 


2.14. THEOREM. Let the notation be as in 2.8 and 2.10. Then
p ⊩ ∃x (x ⊆ a* ∧ θ(x)) iff there exists t ∈ 𝒫^M(a*) such that p ⊩ θ(t).


PROOF. The "if" part is immediate. We break the proof of the "only if"
part into a series of lemmas.


2.15. LEMMA. Let M be a CTM, P ∈ M a PO set, p ∈ P, A ∈ 𝒫^M(P) a
maximal antichain below p. Then:
(i) For any M-generic G ⊆ P with p ∈ G, there is a unique q ∈ A ∩ G.
(ii) For any formula φ, if q ⊩ φ for all q ∈ A, then p ⊩ φ.


PROOF. (i) Let G be generic, p ∈ G. Since elements of generic sets are
compatible, A ∩ G contains at most one element. Let D = {r ∈ P:
∃q ∈ A (r ≤ q)}. We remarked (following 2.12) that this set can be formed
inside M. We claim D is dense below p.

For let p' ≤ p be arbitrary. By maximality of A, there is q ∈ A
compatible with p'. Hence there is r ≤ p' with r ≤ q ∈ A, whence r ∈ D.
This proves density.

By 2.13, D ∩ G ≠ ∅. If r ∈ D ∩ G, there is q ∈ A with r ≤ q, whence
q ∈ G. Thus A ∩ G ≠ ∅ as required.

(ii) Suppose q ⊩ φ for all q ∈ A. Let G be generic with p ∈ G. By (i),
A ∩ G ≠ ∅, and there is q ∈ G forcing φ. Thus M[G] ⊨ φ, as required to
show p ⊩ φ. □


2.16. LEMMA. Let M be a CTM, P ∈ M a PO set. Then:

(i) If A ∈ 𝒫^M(P) is a maximal antichain, and F ∈ M a function with
dom F = A and F(q) a P-term for every q ∈ A, then there is a single
P-term t such that q ⊩ t = F(q) for every q ∈ A.

(ii) If p ∈ P and t, u ∈ M are P-terms, then there is a P-term v ∈ M such
that p ⊩ v = t and any q incompatible with p has q ⊩ v = u.


PROOF. (i) Since A, F ∈ M, we can form inside M a = ∪{range F(q):
q ∈ A}, a single set with F(q) ⊆ P × a for all q ∈ A. We can also form


t = {(r, b) ∈ P × a: ∃q ∈ A ∃p ∈ P (r ≤ q ∧ r ≤ p ∧ (p, b) ∈ F(q))}.


It suffices to check that for generic G and q ∈ A ∩ G, I_G(t) = I_G(F(q)).

Suppose first b ∈ I_G(F(q)). Then (p, b) ∈ F(q) for some p ∈ G. Take
r ∈ G with r ≤ q and r ≤ p. By construction (r, b) ∈ t, so b ∈ I_G(t).
Conversely, suppose b ∈ I_G(t), i.e. there is r ∈ G with (r, b) ∈ t. By
construction there are q' ∈ A and p such that r ≤ q', r ≤ p, and
(p, b) ∈ F(q'). Since r ≤ q', q' ∈ G. Since A ∩ G contains just one
element, q' = q and (p, b) ∈ F(q). Since r ≤ p, p ∈ G, and finally
b ∈ I_G(F(q)) as required.

(ii) Since Zorn's Lemma is true inside M (and since being a maximal
antichain is absolute, cf. 2.12(iii)), there must be a maximal antichain
A ∈ M with p ∈ A. Let F ∈ M be the function with dom F = A, F(p) = t,
and F(q) = u for all other q ∈ A. Applying (i) to this F we get a term v
with p ⊩ v = t and q ⊩ v = u for any other q ∈ A. Now if q is anything
incompatible with p, A − {p} is a maximal antichain below q, so by
2.15(ii), q ⊩ v = u as required. □


Proof of the "only if" part of 2.14 modulo 2.15 and 2.16:

With notation as in 2.8 and 2.10, suppose p ⊩ ∃x (x ⊆ a* ∧ θ(x)).
Applying Zorn's Lemma inside M (and absoluteness considerations), there
is an F ∈ M maximal with respect to the property:


(F is a function) ∧ (dom F is an antichain below p)
  ∧ q ⊩ θ(F(q)) for all q ∈ dom F.


(Recall that the set E = {(q, t) ∈ P × 𝒫^M(a*): q ⊩ θ(t)} belongs to M
as a consequence of the Definability Lemma. The last condition in the
definition of F just says that F ⊆ E.)

We claim dom F is a maximal antichain below p. For assume for
contradiction that p' ≤ p is incompatible with all q ∈ dom F. Since
p' ⊩ ∃x (x ⊆ a* ∧ θ(x)), by 2.10(v) there are p'' ≤ p' and t ∈ 𝒫^M(a*)
such that p'' ⊩ θ(t). A fortiori p'' is incompatible with every
q ∈ dom F. So F ∪ {(p'', t)} is a counterexample to maximality of F, a
contradiction.

Now apply 2.16(i) to this F to obtain a term t with q ⊩ t = F(q) for all
q ∈ dom F. Since q ⊩ θ(F(q)) for all such q, q ⊩ θ(t). By 2.15(ii),
p ⊩ θ(t). □


2.17. COROLLARY. Let notation be as in 2.8, 2.10, and 2.14. Suppose
1_P ⊩ ∃x (x ⊆ a* ∧ θ(x)), and t ∈ 𝒫^M(a*), and p ⊩ θ(t). Then there is
v ∈ 𝒫^M(a*) such that 1_P ⊩ θ(v) and p ⊩ v = t.


PROOF. Note that in general 1_P ⊩ φ iff every q ∈ P forces φ. By 2.14
there is u with 1_P ⊩ θ(u). Let A ∈ M be a maximal antichain with p ∈ A.
Using 2.16(i) we get a v with p ⊩ v = t and q ⊩ v = u for every other
q ∈ A. Thus all q ∈ A force θ(v), so 1_P ⊩ θ(v) by 2.15(ii). □


It is often convenient to write P ⊩ φ for 1_P ⊩ φ.


3. The Continuum Hypothesis 


What does it mean for the Continuum Hypothesis (CH) to be false, i.e.
for ¬CH to be true, inside a CTM M? If κ is a cardinal in M, i.e. if
M ⊨ (κ is a cardinal), let us write (2^κ)^M or exp_2^M(κ) for
card^M(𝒫^M(κ)). We also write (κ^+)^M for the λ ∈ M such that
M ⊨ (λ is the least cardinal > κ), and ω_1^M for (ω^+)^M, ω_2^M for
((ω_1^M)^+)^M, etc. Now CH is false iff there is an injection from ω_2
into 𝒫(ω), or equivalently if there is a function g: ω_2 × ω → {0, 1}
such that:


(*) Whenever α ≠ β and (α, 0) and (β, 0) ∈ dom g,
then there exists n ∈ ω such that g(α, n) ≠ g(β, n).


(Such a g gives rise to an injection f(α) = {n ∈ ω: g(α, n) = 0}.) Thus
M ⊨ ¬CH iff there is an injection f ∈ M from ω_2^M to 𝒫^M(ω). (Recall
that being an injection is absolute.) Equivalently, M ⊨ ¬CH iff there is
a map g ∈ M from ω_2^M × ω to {0, 1} satisfying (*).
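
The passage from g to the injection f can be checked on a finite toy case. The sketch below is an illustration only: it replaces ω_2 by a finite index set and ω by a finite width (both assumptions of the sketch), stores g as a dict keyed by pairs, and uses hypothetical function names.

```python
def satisfies_star(g, indices, width):
    """(*) on a finite toy: distinct rows alpha, beta differ at some n < width."""
    return all(any(g[(a, n)] != g[(b, n)] for n in range(width))
               for a in indices for b in indices if a != b)

def induced_map(g, indices, width):
    """f(alpha) = {n : g(alpha, n) = 0}."""
    return {a: frozenset(n for n in range(width) if g[(a, n)] == 0)
            for a in indices}

def is_injective(f):
    """f takes distinct values on distinct arguments."""
    return len(set(f.values())) == len(f)
```

For example, taking g(a, n) to be the n-th binary digit of a for a < 4 and n < 2 gives a g satisfying (*), and the induced f is injective; a constant g fails both tests.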

If we start with an arbitrary CTM M (we are assuming one exists), we
may have such a g ∈ M (i.e. M ⊨ ¬CH) and we may not. COHEN [1966]
shows how to construct an extension N of M in which such a g does exist
for certain.


3.1. DEFINITION. Let α be an ordinal. Consider
P = {p: (p is a function) ∧ (dom p is finite)
      ∧ dom p ⊆ α × ω ∧ range p ⊆ {0, 1}}.


Partially order P by reverse inclusion: p ≤ q iff q ⊆ p, i.e. if p is a
function extending q. This makes P a PO set. The maximum element 1_P is
the empty function. There are no minimal elements since every function in
P has a proper extension in P. We call this P the α-Cohen PO set. Note
that if M is a CTM and α < OR^M, then by remarks in 1.5(iv), the α-Cohen
PO set can be formed inside M. Elements of the α-Cohen PO set are in an
obvious sense "finite approximations" to a function α × ω → {0, 1}.
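
For readers who like to experiment, here is a minimal Python sketch of the order on Cohen conditions. It assumes, purely for illustration, that α is a natural number (in the text α is an arbitrary ordinal) and that a condition is a dict mapping pairs (a, n) to 0 or 1. The function names are hypothetical.

```python
def is_condition(p, alpha):
    """p is a finite function with dom p inside alpha x omega, values in {0, 1}."""
    return all(isinstance(a, int) and 0 <= a < alpha
               and isinstance(n, int) and n >= 0 and v in (0, 1)
               for (a, n), v in p.items())

def leq(p, q):
    """p <= q iff q is a subset of p, i.e. p extends q."""
    return all(k in p and p[k] == v for k, v in q.items())

def compatible(p, q):
    """p and q have a common extension iff they agree on their common domain."""
    return all(q[k] == v for k, v in p.items() if k in q)

def proper_extension(p):
    """Some condition strictly below p (needs alpha >= 1): p is not minimal."""
    n = 1 + max((m for (_, m) in p), default=-1)
    return {**p, (0, n): 0}
```

The empty dict plays the role of 1_P: `leq(p, {})` holds for every condition p.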


3.2. LEMMA. Let M be a CTM, P the ω_2^M-Cohen PO set, G ⊆ P M-generic,
N = M[G], and g ∈ N the union ∪G of all functions p ∈ G. Then:
(i) g is a function.
(ii) dom g = ω_2^M × ω.
(iii) g satisfies (*) above.


PROOF. (i) Clearly g ⊆ ω_2^M × ω × {0, 1}. If g were not a function,
there would have to be p, q ∈ G and (α, n) ∈ dom p ∩ dom q such that
p(α, n) ≠ q(α, n). But then there could not exist any function r extending
both p and q. But any two elements of G are compatible in P so such r does
exist, a contradiction which establishes (i).

(ii) Let α < ω_2^M, n < ω be arbitrary. Let D = {p ∈ P: (α, n) ∈ dom p}.
Then D is dense, since if p ∈ P, either p ∈ D already or else p' =
p ∪ {(α, n, 0)} is in D and is ≤ p. By density there is p ∈ D ∩ G. Then
(α, n) ∈ dom p ⊆ dom g, proving (ii).

(iii) Let α < β < ω_2^M be arbitrary. Let


D = {p ∈ P: ∃n ∈ ω ((α, n), (β, n) ∈ dom p ∧ p(α, n) ≠ p(β, n))}.


Then D is dense, since if p ∈ P, then by finiteness of dom p there is
n ∈ ω with (α, n), (β, n) ∉ dom p. Setting p' = p ∪ {(α, n, 0), (β, n, 1)},
we have p' ≤ p and p' ∈ D. By density there is p ∈ D ∩ G. But then
g(α, n) = p(α, n) ≠ p(β, n) = g(β, n), proving (iii). (There is no trouble
in seeing that the sets D required in (ii) and (iii) can be formed inside
M.) □
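
The density argument in (iii) is effectively an algorithm, and a short Python sketch may help fix ideas. As before, conditions are assumed to be dicts keyed by pairs (a, n) of natural numbers, which is an illustrative simplification, and the function names are hypothetical. The sketch picks an n occurring nowhere in dom p, which is slightly more than the proof requires.

```python
def fresh_column(p):
    """An n occurring in no pair of dom p; it exists because dom p is finite."""
    return 1 + max((n for (_, n) in p), default=-1)

def separate(p, alpha, beta):
    """Return p' <= p with p'(alpha, n) != p'(beta, n) for some n,
    by adding (alpha, n) -> 0 and (beta, n) -> 1 for a fresh n."""
    n = fresh_column(p)
    return {**p, (alpha, n): 0, (beta, n): 1}

def in_D(p, alpha, beta):
    """p decides (alpha, n) and (beta, n) differently for some n."""
    return any((beta, n) in p and p[(beta, n)] != v
               for (a, n), v in p.items() if a == alpha)
```

Starting from any condition p, `separate(p, alpha, beta)` extends p and lies in D, which is what density of D asserts.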


It is important to notice that 3.2 does not yet tell us that N ⊨ ¬CH. To
get that, we need a function g ∈ N satisfying (*) and having domain
ω_2^N × ω. We will in fact show N ⊨ ¬CH by showing ω_2^N = ω_2^M. First a
simple remark.

If M, N are CTM's with M ⊆ N, and κ < OR^M is a cardinal in N, then it
must also be a cardinal in M. For if there exist λ < κ and a function
f ∈ M from λ onto κ, then this same f is present in N, and κ cannot be a
cardinal there. If μ is any cardinal in M, we say μ is preserved in N if μ
is still a cardinal in N, and otherwise μ is collapsed. Collapsing occurs
when there is a function f ∈ N − M from a smaller ordinal onto μ.

To prove that ω_2^N = ω_2^M in the situation of 3.2, we need to show that
ω_1^M and ω_2^M are preserved in N. We do not, by the above remark, have
to worry about any new cardinals appearing in N. We do have to worry that
ω_2^M might be collapsed (and so not be a cardinal at all in N) or that
ω_1^M might be collapsed (so that ω_2^M would not be the second
uncountable cardinal in N). Several preliminaries are required before we
can rule out these possibilities.


3.3. DEFINITION. Let P be any PO set, κ any uncountable cardinal. We say
P satisfies the κ-antichain condition, or is κ-AC, if there is no antichain
A ⊆ P of cardinality ≥ κ. Thus trivially P is (card P)^+-AC. Unfortunately
the ω_1-antichain condition has come to be called the countable chain
condition (CCC). This usage is confusing, but well established.


3.4. THEOREM. Let M be any CTM, P ∈ M any PO set, κ ∈ M. Suppose
M ⊨ (κ is a regular cardinal ∧ P is κ-AC). Then P ⊩ (κ* is a cardinal).


PROOF. Note that P ⊩ (κ* is a cardinal) is another way of writing
1_P ⊩ (κ* is a cardinal) and is true iff for every M-generic G ⊆ P, κ is
preserved in M[G]. Assume for contradiction that G is generic, and
f ∈ M[G] maps some λ < κ onto κ. By the Minimality Lemma 2.5, f = I_G(t)
for some t ∈ 𝒫^M(P × λ × κ). By the Truth Lemma 2.9, some p ∈ G forces
(t is a surjection from λ* to κ*).

All the following is then true inside M: For α < λ let


A_α = {β < κ: ∃q ≤ p (q ⊩ t(α*) = β*)}.


A_α may be thought of as a set of "possible values of t(α*)". For
β ∈ A_α, pick q(β) ≤ p with q(β) ⊩ t(α*) = β*. For distinct β the q(β)
are incompatible, since they force contradictory things. Since P is κ-AC,
it follows card A_α < κ. Since κ is regular, card A < κ, where
A = ∪_{α<λ} A_α. (All this talk of forcing inside M is justified by the
Definability Lemma 2.11.)

The above shows card^M A < κ. We obtain a contradiction by showing
κ ⊆ A, so that κ cannot be a cardinal in M. To see this, let β < κ be
arbitrary. β = f(α) for some α < λ. By the Truth Lemma, some q ∈ G
forces t(α*) = β*. We may suppose q ≤ p. (If not, there is r ∈ G with
r ≤ p and r ≤ q. But r ≤ q implies that r still forces t(α*) = β*.) Then
q is a witness that β ∈ A_α ⊆ A. Thus κ ⊆ A as asserted. □


Note that in the situation of 3.4, if λ is a cardinal in M with κ ≤ λ,
then a fortiori M ⊨ (P is λ-AC), and P ⊩ (λ* is a cardinal). Thus if
M ⊨ (P is CCC) then all cardinals of M are preserved in all Cohen
extensions M[G] for M-generic G ⊆ P. Note also that trivially any P ∈ M
is really CCC, since it is countable. What matters is whether it is true
inside M that P is CCC.


3.5. LEMMA. The α-Cohen PO set is CCC for any ordinal α.

For the proof we assume the following result:


3.6. COMBINATORIAL LEMMA. Let κ be a cardinal such that κ^λ = κ for all λ < κ. Let F be a family of > κ sets each of cardinality < κ. Then there exist E ⊆ F of cardinality > κ and a fixed set e of cardinality < κ, such that for all distinct d, d′ ∈ E, d ∩ d′ = e.


The hypothesis on κ is satisfied by κ = ω. Assuming CH, it is also satisfied by κ = ω₁. The hypothesis implies κ is regular, since κ^{cf κ} > κ for any cardinal κ. A proof of 3.6 in the case κ = ω can be found in 5.7 of Chapter B.3 or in MARCZEWSKI [1947]. The full 3.6 is an easy generalization.


PROOF OF 3.5 (modulo 3.6). Assume for contradiction there is an uncountable antichain A in the α-Cohen PO set P. For fixed finite a ⊆ α × ω, there are only finitely many functions from a to {0, 1}. Thus F = {dom p: p ∈ A} must be uncountable. Apply 3.6 with κ = ω to obtain an uncountable E ⊆ F and a fixed finite e such that for distinct d, d′ ∈ E, d ∩ d′ = e. Since there are only finitely many functions from e to {0, 1}, there must be p, q ∈ A with dom p, dom q distinct elements of E and p, q agreeing on e. But then r = p ∪ q is a function extending both p and q, and so p, q ∈ A are compatible, a contradiction. □
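
The pigeonhole step at the end of the proof of 3.5 can be made explicit as follows. This is a sketch only: the choice function d ↦ p_d and the letter n are introduced here for illustration, and the restriction symbol assumes the amssymb package.

```latex
% Pick, for each d \in E, one p_d \in A with \operatorname{dom} p_d = d; let n = \operatorname{card} e.
\[
  \operatorname{card}\{\, p_d \restriction e : d \in E \,\} \;\le\; 2^{n} \;<\; \omega
  \qquad\text{while}\qquad
  \operatorname{card} E \;>\; \omega ,
\]
% so p_d \restriction e = p_{d'} \restriction e for some distinct d, d' \in E.  Since d \cap d' = e,
\[
  x \in \operatorname{dom} p_d \cap \operatorname{dom} p_{d'} = e
  \;\Longrightarrow\; p_d(x) = p_{d'}(x),
\]
% hence r = p_d \cup p_{d'} is a finite function with r \le p_d and r \le p_{d'}.
```

Agreement on the common part e is all that is needed for the union of two functions to be a function, which is why the lemma reduces to a statement about the domains alone.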


Now let M, P, G, N be as in 3.2. Applying 3.5 inside M, M ⊨ (P is CCC). Thus by 3.4 every cardinal of M is preserved in N, and in particular ω₂^N = ω₂^M, which was the last step required to prove:


3.7. THEOREM (Cohen). There is a CTM N with N ⊨ ¬CH.


What does it mean for CH to be true inside a CTM M? It means that there is a surjection g ∈ M from ω₁^M onto 𝒫^M(ω). If such a g does not exist (i.e. M ⊨ ¬CH) we show how to construct an extension N of M in which one does exist.


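3.8. DEFINITION. The anti-Cohen PO set is the set

\[ P = \{p : (p \text{ is a function}) \wedge \operatorname{card} \operatorname{dom} p \le \omega \wedge \operatorname{dom} p \subseteq \omega_1 \wedge \operatorname{range} p \subseteq \mathcal{P}(\omega)\}, \]

partially ordered by reverse inclusion.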


3.9. LEMMA. Let M be a CTM, let P ∈ M be such that M ⊨ (P is the anti-Cohen PO set), let G ⊆ P be M-generic, N = M[G], g = ∪G ∈ N. Then g is a surjection from ω₁^M to 𝒫^M(ω).


PROOF. Note that what P really is is

\[ \{p \in M : (p \text{ is a function}) \wedge \operatorname{card}^M \operatorname{dom} p \le \omega \wedge \operatorname{dom} p \subseteq \omega_1^M \wedge \operatorname{range} p \subseteq \mathcal{P}^M(\omega)\}. \]

To see that g is a function, argue as in 3.2(i). To see range g = 𝒫^M(ω) and dom g = ω₁^M, argue as in 3.2(ii), using the density of {p ∈ P: a ∈ range p} for all a ∈ 𝒫^M(ω) and the density of {p ∈ P: α ∈ dom p} for all α < ω₁^M. □


3.9 does not yet show N ⊨ CH, but to get that it will suffice to show that 𝒫^N(ω) = 𝒫^M(ω) and that ω₁^M is preserved in N (so that ω₁^N = ω₁^M). This requires some preliminaries.


3.10. DEFINITION. Let P be any PO set, κ any uncountable cardinal. P is κ-closed if whenever λ ≤ κ and p_ξ, ξ < λ, is a decreasing sequence in P (i.e. p_η ≤ p_ξ when ξ ≤ η), then there is a p ∈ P which is ≤ every p_ξ. P is κ-distributive if any intersection of κ open dense subsets of P is dense. (Trivially such an intersection is open.)


EXAMPLE. The anti-Cohen PO set P is ω-closed. If we have a countable sequence of functions in P with later members of the sequence extending earlier members, then the union p of all members of the sequence is still a function with countable domain, hence is an element of P extending every member of the sequence.


3.11. PROPOSITION. Let P be a PO set, κ an uncountable cardinal. Then:
(i) If P is κ-closed, then P is κ-distributive.
(ii) If κ is singular, and P is λ-closed (or λ-distributive) for every λ < κ, then P is κ-closed (resp. κ-distributive).


PROOF. (i) Suppose P is κ-closed and D_ξ, ξ < κ, are open dense subsets of P. To show their intersection dense, consider an arbitrary p ∈ P. We form a decreasing sequence p_ξ, ξ ≤ κ, of elements of P as follows: Let p₀ = p. If p_ξ is defined, pick p_{ξ+1} ≤ p_ξ and belonging to D_ξ (such exists by density). If λ ≤ κ and Lim(λ) and p_ξ is defined for ξ < λ, let p_λ be ≤ all p_ξ, ξ < λ (such exists by κ-closure). Then we have found p_κ ≤ p with p_κ ∈ ∩_{ξ<κ} D_ξ (by openness, since p_κ ≤ p_{ξ+1} ∈ D_ξ). This proves the required density.
(ii) For closure, note that from any decreasing sequence p_ξ, ξ < κ, we can extract a sequence q_η, η < cf κ, such that for every p_ξ there is a q_η with q_η ≤ p_ξ. For density, note that any intersection of κ sets can be rewritten as an intersection of cf κ sets, each of which is an intersection of < κ sets. □


3.12. THEOREM. Let M be a CTM, P ∈ M a PO set, κ a cardinal in M. Suppose M ⊨ (P is κ-distributive). Let G ⊆ P be M-generic, N = M[G], and f ∈ N a function with dom f = κ, range f ⊆ M. Then f ∈ M.


PROOF. By the Minimality Lemma, f = I_G(t) for some t ∈ M, say t ∈ 𝒫^M(P × κ × b). Assume for contradiction f ∉ X = {h ∈ 𝒫^M(κ × b): h is a function}. By the Truth Lemma, some p ∈ G forces (t is a function ∧ t ∉ X*). We obtain a contradiction by finding q ≤ p and h ∈ X such that q forces ∀α < κ* (t(α) = h*(α)), i.e. t = h*.

All the following is true inside M: For α < κ, let D_α = {q ≤ p: ∃c ∈ b (q ⊩ t(α*) = c*)}. Then D_α is open dense below p in P. Openness is clear. To prove density, consider an arbitrary q ≤ p. Since q ⊩ ∃c ∈ b* (t(α*) = c), there are c ∈ b and r ≤ q such that r ⊩ t(α*) = c* (this by 2.10(v)), whence r ∈ D_α. By κ-distributivity, ∩_{α<κ} D_α is dense below p. (More precisely, apply distributivity to the sets E_α = {q ∈ P: p, q are incompatible ∨ q ∈ D_α}, which are fully dense in P; cf. Lemma 2.13.) Now if q belongs to all the D_α, and we define

\[ h = \{(\alpha, c) \in \kappa \times b : q \Vdash t(\alpha^*) = c^*\}, \]

then h ∈ X and clearly q ⊩ ∀α < κ* (t(α) = h*(α)) as required. (This whole argument "took place inside M". All the talk about forcing inside M involves tacit use of the Definability Lemma.) □


3.13. COROLLARY. Let the notation be as in 3.12. Then:
(i) 𝒫^N(κ) = 𝒫^M(κ).
(ii) Every cardinal in M ≤ (κ⁺)^M is preserved in N.


PROOF. (i) If there were a ∈ 𝒫^N(κ) ∖ 𝒫^M(κ), its characteristic function would be a counterexample to 3.12.

(ii) If λ, μ were cardinals in M with μ < λ ≤ (κ⁺)^M and f ∈ N were a surjection from μ to λ, then the trivial extension g of f to a function with dom g = κ, obtained by setting g(α) = 0 for μ ≤ α < κ, would be a counterexample to 3.12. □


Now let M, P, G, N be as in 3.9. Then M ⊨ (P is ω-closed). By 3.13, 𝒫^N(ω) = 𝒫^M(ω) and ω₁^M is preserved in N, which were the facts needed to prove:


3.14. THEOREM (Gödel). There is a CTM N with N ⊨ CH.


Gödel's original proof of Theorem 3.14 antedated the method of forcing by a quarter century. His proof gives a model in which we have not only CH, but also GCH and an even stronger principle, the so-called Axiom of Constructibility (V = L). See Chapter B.5 for an account of this work.

The following technical result will be used often below: 


3.15. THEOREM. Let M be a CTM, P ∈ M a PO set, and κ, λ, μ, ν cardinals in M. Suppose M ⊨ (ν = κ^μ ∧ card P = κ ∧ P is λ⁺-AC). Then P ⊩ 2^{μ*} ≤ ν*.


PROOF. Let G ⊆ P be M-generic, N = M[G]. We show (2^μ)^N ≤ ν by showing that inside N every subset of μ can be obtained using I_G from elements of a certain set S ∈ M with card^M S ≤ ν. Indeed, let

\[ S = \{f \in M : (f \text{ is a function}) \wedge \operatorname{dom} f \subseteq P \times \mu \wedge \operatorname{range} f \subseteq \{0,1\} \wedge \forall \alpha < \mu\ (A(\alpha, f) = \{p \in P : (p, \alpha) \in \operatorname{dom} f\} \text{ is a maximal antichain})\}. \]

We can see S ∈ M by appealing to the Definability Lemma.

All the following is true inside M: Any antichain in P has cardinality ≤ λ. Since card P = κ, the number of antichains A ⊆ P is ≤ κ^λ. Every f ∈ S is completely determined by the sequence of μ antichains A(α, f), α < μ. Thus card S ≤ (κ^λ)^μ = κ^μ = ν.

All this was true inside M, showing card^M S ≤ ν, hence card^N S ≤ ν. Further, all the following is true inside N: For each α < μ and f ∈ S, A(α, f) ∩ G has just one element (by 2.15(i)), call it q(α, f). Define J(f) = {α < μ: f(q(α, f), α) = 0}. Then J maps S into 𝒫(μ).

We complete the proof by showing the J ∈ N just defined maps S onto 𝒫^N(μ). To see this, let a ∈ 𝒫^N(μ) be arbitrary. a = I_G(t) for some t ∈ 𝒫^M(P × μ).

All the following is then true inside M: By Zorn's Lemma, there is an f ⊆ P × μ × {0, 1} maximal with respect to the property:

\[ (\text{if } (p, \alpha, 0) \in f, \text{ then } p \Vdash \alpha^* \in t) \wedge (\text{if } (p, \alpha, 1) \in f, \text{ then } p \Vdash \alpha^* \notin t) \wedge A(\alpha, f) = \{p : (p, \alpha) \in \operatorname{dom} f\} \text{ is an antichain for all } \alpha. \]

Note that the first two conditions imply f is a function. We claim the A(α, f) are actually maximal antichains. For if for some α, some q ∈ P were incompatible with all p ∈ A(α, f), then since q ⊩ (α* ∈ t ∨ α* ∉ t), there would be an r ≤ q such that either r ⊩ α* ∈ t or r ⊩ α* ∉ t. In the former case, f ∪ {(r, α, 0)} would be a counterexample to the maximality of f. In the latter case, f ∪ {(r, α, 1)} would be. This contradiction shows the A(α, f) are maximal antichains as claimed, and hence f ∈ S.

It remains to show J(f) = a. Let α < μ be arbitrary. Suppose first α ∈ a. By the Truth Lemma, some p ∈ G forces α* ∈ t. Now we cannot have f(q(α, f), α) = 1, for then by the defining property of f, q(α, f) ∈ G would force α* ∉ t, making q(α, f) incompatible with p ∈ G, a contradiction. So we must have f(q(α, f), α) = 0, and α ∈ J(f). Similarly, if α ∈ μ ∖ a, then α ∉ J(f). Thus J(f) = a as required to complete the proof. □
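
The count of S inside M in the proof of 3.15 can be unpacked as below. This is a hedged sketch: it assumes that κ, λ, μ are infinite and that λ ≤ μ, which holds in every application made in 3.16 but is an assumption added here, not a hypothesis stated in 3.15. The factor 2^λ for the labelings is likewise spelled out only for illustration.

```latex
% Assumptions (not stated in 3.15 itself): \kappa, \lambda, \mu are infinite and \lambda \le \mu.
\[
  \operatorname{card}\{\, A \subseteq P : A \text{ an antichain} \,\}
  \;\le\; \operatorname{card}\{\, A \subseteq P : \operatorname{card} A \le \lambda \,\}
  \;\le\; \kappa^{\lambda},
\]
\[
  \operatorname{card} S
  \;\le\; \bigl(\kappa^{\lambda}\cdot 2^{\lambda}\bigr)^{\mu}
  \;=\; \bigl(\kappa^{\lambda}\bigr)^{\mu}
  \;=\; \kappa^{\lambda\cdot\mu}
  \;=\; \kappa^{\mu}
  \;=\; \nu .
\]
% The factor 2^{\lambda} counts the 0-1 labelings of a single antichain;
% it is absorbed because 2^{\lambda} \le \kappa^{\lambda}.
```

The step κ^{λ·μ} = κ^μ is where λ ≤ μ enters, since then λ·μ = μ for infinite cardinals.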


3.16. Consequences of 3.15 

Let M be a CTM with M ⊨ GCH. The work of Gödel mentioned after Theorem 3.14 provides such an M.

(i) Let P ∈ M be the ω₂^M-Cohen conditions, G ⊆ P M-generic, N = M[G]. We have seen N ⊨ 2^ω ≥ ω₂. Now applying 3.15 with κ = ν = ω₂^M and λ = μ = ω, we see N ⊨ 2^ω = ω₂ exactly. Next applying 3.15 with κ = ν = ω₂^M, λ = ω, and μ = ω₁^M, we see N ⊨ 2^{ω₁} = ω₂. (Note GCH implies ω₂^{ω₁} = ω₂^ω = ω₂.) This establishes the consistency of 2^ω = 2^{ω₁}.

(ii) Similar use of the ω₃-Cohen conditions gives a model with 2^ω = ω₃. In fact we can get any "reasonable" value for 2^ω. (It has long been known that 2^ω ≠ ω_ω. Use of the (ω_ω)^M-Cohen conditions turns out to make 2^ω = ω_{ω+1}.)

(iii) Let

\[ P = \{p \in M : (p \text{ is a function}) \wedge \operatorname{card}^M \operatorname{dom} p \le \omega \wedge \operatorname{dom} p \subseteq \omega_3^M \times \omega_1^M \wedge \operatorname{range} p \subseteq \{0,1\}\}. \]

It is easy to see M ⊨ (P is ω-closed). Using GCH inside M and 3.6, it can be seen that M ⊨ (P is ω₂-AC). Thus if G ⊆ P is M-generic, then by 3.4 and 3.13, every cardinal of M is preserved in N = M[G]. It is then not hard to see N ⊨ 2^{ω₁} ≥ ω₃, and using 3.15 it can even be established that N ⊨ 2^{ω₁} = ω₃ exactly. By ω-closure and 3.13, 𝒫^N(ω) = 𝒫^M(ω), so N ⊨ 2^ω = ω₁. To get combinations like 2^{ω₁} = ω₃ and 2^ω = ω₂ will require the more sophisticated methods of Section 5 below.
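
The cardinal arithmetic behind 3.16(i) and (iii) can be summarized as follows. This is a sketch under the assumption of GCH in M; the general exponentiation fact quoted in the first comment is standard but is not proved in this chapter, and the parameter choice suggested for (iii) is one possibility rather than something stated in the text.

```latex
% Standard consequence of GCH (used, not proved, here):
% \kappa^{\lambda} = \kappa for infinite cardinals with \lambda < \operatorname{cf}\kappa.
\[
  \omega_2^{\omega} \;=\; \omega_2^{\omega_1} \;=\; \omega_2 ,
  \qquad
  \omega_3^{\omega} \;=\; \omega_3^{\omega_1} \;=\; \omega_3 .
\]
% (i): 3.15 bounds 2^{\omega} and 2^{\omega_1} by \omega_2 in N, and 2^{\omega} \ge \omega_2 there, so
\[
  \omega_2 \;\le\; 2^{\omega} \;\le\; 2^{\omega_1} \;\le\; \omega_2
  \quad\text{in } N .
\]
% (iii): one possible choice of parameters for 3.15 is
% \kappa = \nu = \omega_3^M,\ \lambda = \mu = \omega_1^M, giving 2^{\omega_1} \le \omega_3 in N.
```

In each case 3.15 supplies only the upper bound; the matching lower bound comes from the generic object itself.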


4. Useful combinatorial principles 


The principles to be discussed in this section have proved useful 
everywhere from universal algebra to general topology. (E.g. see Chapter 
B.7.) We show each principle (relatively) consistent by constructing a 
model for it. In most cases consistency can also be established by deriving 
the principle from V=L (see Chapter B.5), but these derivations are 
usually longer than the forcing constructions used here. 


4.1. DEFINITION. Let κ be a regular uncountable cardinal. A κ Kurepa tree is a tree T of height κ whose levels all have cardinality < κ, and which has > κ branches. The κ Kurepa Hypothesis, KH(κ), is the proposition that κ Kurepa trees exist. We aim to show KH(ω₁) is consistent.

Let P be the set of all pairs (T_p, l_p) where:

(1) T_p is a tree.

(2) The elements of |T_p| are ordinals < κ.

(3) Every level of T_p has cardinality < κ.

(4) T_p has height a successor ordinal α_p + 1 < κ, i.e. T_p has a top level, the α_p-th.

(5) l_p is a bijection from a subset of κ⁺ to the top level of T_p.

Partially order P by setting p ≤ q iff:

(6) T_p is an end extension of T_q, i.e. the levels of T_p up to the α_q-th are identical with the levels of T_q, but T_p (possibly) has new levels above the α_q-th.

(7) dom l_p ⊇ dom l_q.

(8) For every ρ ∈ dom l_q, l_p(ρ) ≥ l_q(ρ) in T_p.

Thus ordered P is called the KH(κ) PO set.

If p ∈ P, T_p may be thought of as part of a Kurepa tree T we would like to construct, and l_p as a labeling indicating for certain ρ < κ⁺ which element of the top level of T_p belongs to the ρ-th branch through T (in some list of κ⁺ branches we are trying to construct).


4.2. LEMMA. Let κ be regular and uncountable, P the KH(κ) PO set. Then P is λ-closed for all λ < κ.


PROOF. Let λ < κ and let p_ξ, ξ < λ, be a decreasing sequence in P. Write T_ξ for the tree part of p_ξ, and l_ξ, α_ξ similarly to avoid double subscripts. Note that the sequence of α_ξ is nondecreasing. Since T_η is an end extension of T_ξ for ξ < η, the T_ξ fit together to form a single tree T of height α = sup{α_ξ: ξ < λ}. Let d = ∪_{ξ<λ} dom l_ξ. For ρ ∈ d, let

\[ b(\rho) = \{a \in T : \exists \xi < \lambda\ (\rho \in \operatorname{dom} l_\xi \wedge a \le l_\xi(\rho))\}. \]

Since l_η(ρ) ≥ l_ξ(ρ) for ξ ≤ η < λ and ρ ∈ dom l_ξ, b(ρ) is a branch through T. Moreover since the maps l_ξ are injections, b(ρ) ≠ b(σ) for ρ ≠ σ. Form a tree T′ by adding to T a new level at the very top containing for each ρ ∈ d an element a_ρ which is ≥ every a ∈ b(ρ). Define a map l with domain d by setting l(ρ) = a_ρ. It is easily seen that p_λ = (T′, l) is an element of P ≤ every p_ξ. □


4.3. PROPOSITION. Let κ be an uncountable cardinal such that κ^λ = κ for all λ < κ, and P the KH(κ) PO set. Then P is κ⁺-AC.


PROOF. Suppose A ⊆ P has cardinality κ⁺. We show A is not an antichain. The assumption on κ implies that there are only κ subsets of κ of cardinality < κ. Moreover for a fixed subset of cardinality λ, there are only 2^λ ≤ κ^λ = κ ways to partially order it and make it into a tree. Thus there are only κ trees satisfying the conditions on T_p in the definition of the KH(κ) PO set. So there exist A′ ⊆ A of cardinality κ⁺ and a fixed tree T such that T_p = T for all p ∈ A′.

For fixed d ⊆ κ⁺ of cardinality λ < κ there are no more than κ bijections from d to the top level of T, hence no more than κ p ∈ A′ with dom l_p = d. Thus D = {dom l_p: p ∈ A′} has cardinality κ⁺, and we can apply Lemma 3.6 to get an E ⊆ D with card E = κ⁺ and a fixed e such that d ∩ d′ = e for all distinct d, d′ ∈ E. Since there are only κ maps from this e to the top level of T, there must be p, q ∈ A′ with dom l_p, dom l_q distinct elements of E, and l_p, l_q agreeing on e. We complete the proof by showing such p, q are compatible.

Let us add a new level to the top of T containing one new point above each element l_p(ρ) for ρ ∈ e, and two new points above each element l_p(ρ) for ρ ∈ dom l_p ∖ e. (Note any such element is also of the form l_q(σ) for some σ ∈ dom l_q ∖ e.) This forms a new tree T′. Define a map l′ with domain dom l_p ∪ dom l_q by setting:

\[ l'(\rho) = \begin{cases} \text{the unique point above } l_p(\rho), & \text{if } \rho \in e, \\ \text{the 1st point above } l_p(\rho), & \text{if } \rho \in \operatorname{dom} l_p \setminus e, \\ \text{the 2nd point above } l_q(\rho), & \text{if } \rho \in \operatorname{dom} l_q \setminus e. \end{cases} \]

It is easily seen that (T′, l′) is ≤ both p and q in P as required. □
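
The two counting claims at the start of the proof of 4.3 can be written out as follows. This is only a sketch under the hypothesis κ^λ = κ for all λ < κ; the variables x and R are introduced here for illustration.

```latex
% Subsets of \kappa of cardinality < \kappa (the sum runs over cardinals \lambda < \kappa):
\[
  \operatorname{card}\{\, x \subseteq \kappa : \operatorname{card} x < \kappa \,\}
  \;\le\; \sum_{\lambda<\kappa} \kappa^{\lambda}
  \;=\; \sum_{\lambda<\kappa} \kappa
  \;=\; \kappa ,
\]
% Binary relations on a fixed x with card x = \lambda infinite:
\[
  \operatorname{card}\{\, R : R \subseteq x \times x \,\}
  \;=\; 2^{\lambda\cdot\lambda}
  \;=\; 2^{\lambda}
  \;\le\; \kappa^{\lambda}
  \;=\; \kappa
  \qquad (\operatorname{card} x = \lambda,\ \omega \le \lambda < \kappa).
\]
% Hence at most \kappa \cdot \kappa = \kappa trees occur as T_p, and among \kappa^{+} conditions
% some single tree T must occur \kappa^{+} times.
```

For finite x the number of relations is finite, so the bound κ holds there as well.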


We leave it to the reader to check that if κ is regular, P the KH(κ) PO set, then the sets D_α = {p ∈ P: α_p > α}, for α < κ, and the sets E_ρ = {p ∈ P: ρ ∈ dom l_p}, for ρ < κ⁺, are dense.


4.4. THEOREM (Stewart). There is a CTM N with N ⊨ KH(ω₁).


PROOF. The reader may prove, or trust us, that the notions of being a tree, and of level, height, and branch for trees are absolute. (Any proof would invoke 1.6(iii) and (iv).) Thus we do not need to distinguish between application of the notions inside and outside a CTM.

Let M be a CTM with M ⊨ GCH. Let κ = ω₁^M. Let P ∈ M be such that M ⊨ P is the KH(κ) PO set. Let G ⊆ P be M-generic, N = M[G], and T = ∪_{p∈G} T_p ∈ N. We show N ⊨ T is a κ Kurepa tree. Note that by 4.2 and 4.3 and lemmas of Section 3, all cardinals of M are preserved in N, and so κ = ω₁^N, ω₂^M = ω₂^N.

If p, q ∈ P are compatible, then either T_p is an end extension of T_q or vice versa. Since elements of G are compatible, it follows that T is a tree, and an end extension of all the T_p, p ∈ G. Moreover, any level of T is a level of T_p for some p ∈ G, hence has cardinality < κ inside N. For α < κ, the set D_α defined above (after the proof of 4.3) is dense, so D_α ∩ G ≠ ∅. This implies height(T) > α. Thus height(T) = κ.

If p, q ∈ G and ρ ∈ dom l_p ∩ dom l_q, then by compatibility of p, q, l_p(ρ) and l_q(ρ) are comparable in T. (They are both ≤ l_r(ρ) for any r ≤ both p, q.) If ρ, σ are distinct elements of dom l_p, then l_p(ρ) and l_p(σ) are incompatible in T. (They are distinct elements of the same level of T, the α_p-th.) If α < κ, ρ < ω₂^M, and E_ρ is as defined above (after 4.3), then by density there is p ∈ D_α ∩ E_ρ ∩ G, and so l_p(ρ) is defined and of level > α in T. These observations show that, setting

\[ B(\rho) = \{a \in T : \exists p \in G\ (\rho \in \operatorname{dom} l_p \wedge a \le l_p(\rho))\} \quad \text{for } \rho < \omega_2^M, \]

B(ρ) is a branch through T, and B(ρ) ≠ B(σ) for distinct ρ, σ. Thus {B(ρ): ρ < ω₂^M} provides, inside N, a family of > κ branches through T, as required to show N ⊨ KH(κ). □


Simple variants of the above construction give KH(ω₂), KH(ω₃), etc. We can also get an ω₁ Kurepa tree with ω₃ or more branches, though this involves making 2^{ω₁} > ω₂, since no tree of cardinality κ can have more than 2^κ branches. A subtler variant is the following:


4.5. THEOREM (Silver). There is a CTM N with N ⊨ (2^{ω₁} > ω₂ ∧ there exists an ω₁ Kurepa tree with exactly ω₂ branches).


PROOF. Let M, P, G, N, T be as in the proof of Theorem 4.4. Thus inside N, T is an ω₁ Kurepa tree. We have N ⊨ 2^ω = ω₁, since this holds in M and (by closure conditions on P) 𝒫^N(ω) = 𝒫^M(ω). We also have N ⊨ 2^{ω₁} = ω₂, using GCH inside M and 3.15 (with κ = ν = ω₂^M = ω₂^N, λ = μ = ω₁^M = ω₁^N). Thus N ⊨ T has exactly ω₂ branches.

Now let Q ∈ N be {p: p is a function ∧ card^N dom p ≤ ω ∧ dom p ⊆ ω₃^N × ω₁^N ∧ range p ⊆ {0, 1}}, partially ordered by reverse inclusion. We remarked at the end of Section 3 (see 3.16(iii)) that N ⊨ (Q is ω-closed ∧ Q is ω₂-AC). We also saw that if H ⊆ Q is N-generic, then all cardinals of N are preserved in N[H], and N[H] ⊨ 2^{ω₁} = ω₃. Since all cardinals are preserved, T is still an ω₁ Kurepa tree inside N[H]. To show N[H] ⊨ T has exactly ω₂ branches, it suffices to show that any branch through T in N[H] already belongs to N. This, in slightly different notation, is the content of the following:


4.6. LEMMA. Let M be any CTM, P ∈ M a PO set, κ ∈ M, T ∈ M. Suppose M ⊨ (κ is a successor cardinal ∧ P is λ-closed for all λ < κ ∧ T is a tree ∧ height(T) = κ ∧ every level of T has cardinality < κ). Let G ⊆ P be M-generic, N = M[G]. Then every branch through T in N already belongs to M.


PROOF. For simplicity consider the case κ = ω₁^M. Let X = {B ∈ M: B is a branch through T}. Assume for contradiction B ∈ N is a branch through T and B ∉ X. Then B = I_G(t) for some t ∈ 𝒫^M(P × T). Call p good if p ⊩ (t is a branch through T* ∧ t ∉ X*). Note that there is a good p ∈ G. Note also:

(i) If p is good and q ≤ p, then q is good.

(ii) If p is good, p ⊩ a* ∈ t, and b ≤ a in T, then p ⊩ b* ∈ t.

(iii) If p is good, and p ⊩ a* ∈ t, p ⊩ b* ∈ t, then a, b are comparable in T.

(iv) If p is good, and α < κ, then there exist q ≤ p and a ∈ T_α such that q ⊩ a* ∈ t. (Since p ⊩ ∃x ∈ T_α* (x ∈ t).)

(v) If p is good, and α < κ, then there exist β, α < β < κ, and q, r ≤ p, and distinct a, b ∈ T_β, such that q ⊩ a* ∈ t and r ⊩ b* ∈ t.

To see (v), suppose no such β, q, r, a, b can be found. By (i)-(iv) it follows C = {a ∈ T: ∃q ≤ p (q ⊩ a* ∈ t)} is a branch through T. Moreover C ∈ M. (This implicitly uses the Definability Lemma.) But it is easily seen that p ⊩ t = C* ∈ X*, contrary to the goodness of p.

Now all the following is true inside M: We may form for every finite sequence s of 0's and 1's a good p(s) ∈ P, and α(s) < κ, and an a(s) ∈ T_{α(s)} such that p(s) ⊩ (a(s))* ∈ t, as follows: For the empty sequence ∅, let p(∅) = any good p, α(∅) = 0, a(∅) = the unique element of T₀, i.e. the minimum element in T. If p(s), α(s), a(s) are defined, then by (v) above we can find β, α(s) < β < κ, and p(s0), p(s1) ≤ p(s), and distinct a(s0), a(s1) ∈ T_β with p(si) ⊩ a(si)* ∈ t for i = 0, 1. Set α(s0) = α(s1) = β. By (iii) above, a(s) ≤ a(s0), a(s1) in T. Now for every infinite sequence S of 0's and 1's we can by countable closure find p(S) ∈ P with p(S) ≤ p(S|n) for all n < ω. Let α = sup{α(s): s a finite sequence of 0's and 1's}. By (iv) above, for every S there exist q(S) ≤ p(S) and a(S) ∈ T_α such that q(S) ⊩ a(S)* ∈ t. By construction a(s0), a(s1) were chosen incompatible in T. By (iii), a(S|n) ≤ a(S) for all n < ω. Hence for S ≠ S′, a(S) ≠ a(S′). But then card T_α ≥ 2^ω ≥ ω₁.

All this was true inside M, showing card^M T_α ≥ ω₁^M, contrary to hypothesis. This contradiction establishes 4.6 and 4.5. □
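
The final counting step in the proof of 4.6, that distinct infinite sequences give distinct elements of T_α, can be spelled out as below. This is a sketch in the notation of the proof; the names n and s, and the normalization S(n) = 0, S′(n) = 1, are introduced here for illustration.

```latex
% S, S' : \omega \to \{0,1\} distinct; n least with S(n) \ne S'(n); s = S \restriction n = S' \restriction n.
% Without loss of generality S(n) = 0 and S'(n) = 1.
\[
  a(s0) \le a(S), \qquad a(s1) \le a(S'), \qquad
  a(s0) \ne a(s1) \ \text{both in } T_{\beta},\ \beta = \alpha(s0) = \alpha(s1).
\]
% If a(S) = a(S'), then a(s0), a(s1) would be comparable predecessors of one element of T,
% hence equal (same level): contradiction.  So S \mapsto a(S) is injective, and
\[
  \operatorname{card} T_{\alpha} \;\ge\; \operatorname{card}\,{}^{\omega}\{0,1\} \;=\; 2^{\omega} \;\ge\; \omega_1 .
\]
```

The argument uses only that the predecessors of a single element of a tree are linearly ordered, so two of them on the same level must coincide.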


The above proof comes from SILVER [1971]. For a different forcing argument, see JECH [1971]. We now turn to another variant of KH.


4.7. DEFINITION. If T is a tree, α < height(T), and b an element of T of level ≥ α (or a branch through T), we mean by π_α(b) the unique a ∈ T_α with a ≤ b (resp., with a ∈ b).

Let κ be a regular uncountable cardinal. W(κ) is the proposition that there exist:

(1) a κ Kurepa tree T,

(2) a family F of > κ branches through T,

(3) a function W with domain κ,

such that for all α < κ, W(α) is a family of < κ subsets of T_α, and for any S ⊆ F with card S < κ, there exists α < κ, such that for any β with α < β < κ, {π_β(B): B ∈ S} ∈ W(β).

Let P be the set of quadruples p = (T_p, l_p, W_p, S_p) such that:

(4) (T_p, l_p) belongs to the KH(κ) PO set,

(5) W_p is a function with domain height(T_p) = α_p + 1,

(6) W_p(α) is a family of < κ subsets of the α-th level of T_p for α ∈ dom W_p,

(7) S_p is a family of < κ subsets of dom l_p.

Partially order P by setting p ≤ q iff:

(8) (T_p, l_p) ≤ (T_q, l_q) in the KH(κ) PO set,

(9) W_p extends W_q,

(10) S_p ⊇ S_q,

(11) for every α with α_q < α ≤ α_p and every s ∈ S_q,

\[ \{\pi_\alpha(l_p(\rho)) : \rho \in s\} \in W_p(\alpha). \]

Thus ordered P is called the W(κ) PO set. The proof of the following (which closely resembles that of 4.2 and 4.3) is left to the reader:


4.8. LEMMA. Let κ be a regular uncountable cardinal, P the W(κ) PO set. Then:
(i) P is λ-closed for all λ < κ.
(ii) If κ^λ = κ for all λ < κ, then P is κ⁺-AC.
(iii) For α < κ, D_α = {p: α_p > α} is dense. For S ⊆ κ⁺ with card S < κ, E_S = {p: S ⊆ dom l_p ∧ S ∈ S_p} is dense.


4.9. THEOREM (Silver). There is a CTM N with N ⊨ W(ω₁).


PROOF. Let M be a CTM with M ⊨ GCH. Let κ = ω₁^M. Let P ∈ M be such that M ⊨ P is the W(κ) PO set. Let G ⊆ P be M-generic, N = M[G]. Note that all cardinals of M are preserved in N.

Let T ∈ N be ∪_{p∈G} T_p. For ρ < ω₂^M = ω₂^N let B(ρ) be the branch through T defined in the proof of Theorem 4.4. Let F = {B(ρ): ρ < ω₂^M}. For α < ω₁^M = ω₁^N, let W(α) = W_p(α) for some p ∈ D_α ∩ G, where D_α is as in 4.8(iii). This definition of W(α) is independent of choice of p by compatibility of elements of G and the fact that if p, q are compatible, then W_p extends W_q or vice versa. We claim T, F, W are as required to make W(ω₁) true inside N.

To see this, let S ∈ 𝒫^N(F) have card^N S ≤ ω. Let S′ = {ρ: B(ρ) ∈ S}. By the closure condition on P and Theorem 3.12, any f ∈ N mapping ω onto S′ belongs to M. Hence S′ ∈ M. By density of the set E_{S′} defined in 4.8(iii), there is p ∈ G with S′ ∈ S_p. It suffices to show that if α_p < α, then {π_α(B): B ∈ S} ∈ W(α). To see this, consider q ∈ D_α ∩ G with q ≤ p. The last clause in the definition of the order on P guarantees that {π_α(B): B ∈ S} = {π_α(l_q(ρ)): ρ ∈ S′} ∈ W_q(α) = W(α). □


4.10. DEFINITION. Let κ be an uncountable cardinal. □(κ) is the proposition that there exists a function C with domain {α < κ⁺: Lim(α)} such that for all α ∈ dom C:

(i) C(α) ⊆ α and is CUB in α.

(ii) If cf α < κ, then card C(α) < κ.

(iii) If β is a limit point in C(α), i.e. if β < α and C(α) ∩ β is cofinal in β, then C(β) = C(α) ∩ β.

In case κ is singular, the antecedent of (ii) is always fulfilled, and so (ii) reduces to:

(ii′) card C(α) < κ.

We aim to prove that □(ω₁), □(ω_ω), etc. are consistent.

Let P be the set of all functions p with domain {α ≤ α(p): Lim(α)} for some limit ordinal α(p) < κ⁺, and satisfying (i)-(iii) above for all α ∈ dom p. Partially order P by reverse inclusion. Thus ordered P is called the □(κ) PO set.


4.11. LEMMA. Let κ be an uncountable cardinal, P the □(κ) PO set. Then P is κ-distributive.


PROOF. P is not κ-closed, but we can prove distributivity directly. First consider λ < κ and open dense D_ξ ⊆ P, ξ < λ. We show ∩_{ξ<λ} D_ξ is dense. This proves λ-distributivity for all λ < κ. In case κ is regular, the same proof works for λ = κ. In case κ is singular, we get κ-distributivity by 3.11(ii).

Let p ∈ P be arbitrary. We attempt to construct a decreasing sequence p_ξ, ξ ≤ λ, in P as follows: Let p₀ = p. At a successor ordinal ξ + 1, if p_ξ is defined, choose p_{ξ+1} < p_ξ and belonging to D_ξ. At a limit ordinal ζ, if p_ξ has been defined for all ξ < ζ, then f = ∪_{ξ<ζ} p_ξ is a function satisfying 4.10(i)-(iii) and having as domain the set of all limit ordinals < α, where α = sup{α(p_ξ): ξ < ζ}. Let g be the function extending f with domain {β ≤ α: Lim(β)} obtained by setting g(α) = {α(p_ξ): ξ < ζ}. If this g still satisfies 4.10(i)-(iii), set p_ζ = g. Otherwise p_ζ is undefined and our inductive construction breaks down.

If this process never breaks down, then p_λ is defined and belongs to all the D_ξ. Since our original p was arbitrary, this proves the required density of ∩_{ξ<λ} D_ξ.

So assume for contradiction the process does break down at some limit ordinal ζ ≤ λ. This can only happen if, in the notation above, setting g(α) = {α(p_ξ): ξ < ζ} results in a violation of 4.10(i)-(iii).

But obviously g(α) is cofinal in α. Now if ξ < η < ζ then α(p_ξ) < α(p_η) since p_η < p_ξ. So if β < α is a limit point of g(α), β has the form sup{α(p_ξ): ξ < η} for some limit η < ζ. But then by construction α(p_η) = β, and β ∈ g(α). Thus g(α) is closed in α, and 4.10(i) is not violated.

Moreover, by construction for limit η < ζ, g(α(p_η)) = p_η(α(p_η)) = {α(p_ξ): ξ < η} = g(α) ∩ α(p_η). Thus 4.10(iii) is not violated either. Nor is 4.10(ii), since card g(α) ≤ card ζ ≤ λ < κ. We have reached a contradiction, which shows that our inductive construction never breaks down. □


4.12. LEMMA. Let κ, P be as in Lemma 4.11. Then for any limit ordinal α < κ⁺, D_α = {p: α(p) ≥ α} is open dense in P.


PROOF. Openness is clear. We prove density by induction on α. If there is a greatest limit ordinal β < α and D_β is dense, we have α = β + ω. Moreover any element p of D_β can be trivially extended to an element q of D_α by setting q(α) = {β + n: n < ω}, so D_α is dense. If α is a limit of limit ordinals and D_β is open dense for all limits β < α, then D_α = ∩_{β<α} D_β is dense by κ-distributivity. □


Notice that if we assume GCH, the cardinality of the □(κ) PO set is κ⁺, so it is trivially κ⁺⁺-AC.


4.13. THEOREM (Jensen). There exists a CTM N with N ⊨ □(ω₁), and the same is true for □(ω₂), □(ω_ω), etc.


PROOF. Let M be a CTM with M ⊨ GCH. Let κ = ω₁^M or ω₂^M or what have you. Let P ∈ M be such that M ⊨ P is the □(κ) PO set. Let G ⊆ P be M-generic, N = M[G], C = ∪G ∈ N. By the distributivity and chain conditions on P, all cardinals of M are preserved in N. It is easily verified that C is a function (cf. Lemma 3.2(i)), with dom C the set of limit ordinals < (κ⁺)^M. By preservation of cardinals, (κ⁺)^N = (κ⁺)^M. It is then easily verified that C satisfies 4.10(i)-(iii), recalling that sets have the same cardinalities in M and N, and that being CUB is an absolute property. Details are left to the reader. □


4.14. DEFINITION. Let κ be a regular uncountable cardinal. E(κ) is the proposition that there exists a stationary set E ⊆ κ such that E ∩ α is not stationary in α for any limit ordinal α < κ.

Let P be the set of pairs p = (E_p, α_p) such that α_p < κ, E_p ⊆ α_p, and E_p ∩ α is not stationary in α for any limit ordinal α ≤ α_p. Partially order P by setting p ≤ q iff α_p ≥ α_q and E_p ∩ α_q = E_q. Thus ordered P is called the E(κ) PO set.


4.15. LEMMA. Let κ be a regular uncountable cardinal, P the E(κ) PO set. Then P is λ-distributive for all λ < κ.


PROOF. Let λ < κ, and let D_ξ, ξ < λ, be open dense subsets of P. Let p ∈ P be arbitrary. We attempt to construct p_λ ≤ p with p_λ ∈ ∩_{ξ<λ} D_ξ, proving density. We do this by attempting to construct a decreasing sequence p_ξ = (E_ξ, α_ξ), ξ ≤ λ, as follows:

Let p₀ = p. At a successor step, if p_ξ is defined, note that trivially (E_ξ, α_ξ + 1) ≤ p_ξ. Choose p_{ξ+1} ≤ (E_ξ, α_ξ + 1) belonging to D_ξ. This ensures that α_ξ ∉ E_{ξ+1}. At a limit step, if p_ξ is defined for all ξ < ζ, let α = sup{α_ξ: ξ < ζ}, E = ∪_{ξ<ζ} E_ξ. If (E, α) ∈ P, set E_ζ = E, α_ζ = α, p_ζ = (E, α). If (E, α) ∉ P, p_ζ is undefined and our inductive construction breaks down.

Clearly p_λ, if defined, belongs to all D_ξ as required. Assume for contradiction that the above process breaks down at some limit ζ ≤ λ. This can only happen if, in the notation above, E ∩ β is stationary in β for some limit ordinal β ≤ α. If β < α, then β ≤ α_ξ for some ξ < ζ and E ∩ β = E_ξ ∩ β is not stationary in β. So E must be stationary in α.

By construction, however, E is disjoint from C = {α_ξ: ξ < ζ}, since α_ξ ∉ E_{ξ+1}. Clearly this C is unbounded in α. If ξ < η < ζ then α_ξ < α_η. Thus if β is a limit point of C, β has the form sup{α_ξ: ξ < η} for some limit η < ζ. But then by construction β = α_η ∈ C. Thus C is closed in α. E is disjoint from the CUB set C, hence nonstationary; a contradiction completing the proof. □


4.16. THEOREM (Jensen). There exists a CTM N with N ⊨ E(ω₂), and the same is true for E(ω₃), etc.


PROOF. Because the notions of CUB and stationary set degenerate in the case of countable ordinals, we do not consider E(ω₁). Let M be a CTM with M ⊨ GCH. Let κ = ω₂^M or ω₃^M or what have you. Let P ∈ M be such that M ⊨ P is the E(κ) PO set. Let G ⊆ P be M-generic, N = M[G], E = ∪_{p∈G} E_p. By the distributivity and κ⁺-antichain conditions on P (the latter being trivial, since card^M P = κ), all cardinals of M are preserved in N. We show E is as required to make E(κ) true inside N.

If p, q ∈ P are compatible, E_p = E_q ∩ α_p or vice versa. Since all elements of G are compatible, it follows E_p = E ∩ α_p for all p ∈ G. If D_α = {p ∈ P: α_p > α}, then it is quite easily seen that D_α is dense for all α < κ. Now consider a limit ordinal α < κ. There is p ∈ D_α ∩ G. For such p, E ∩ α = E_p ∩ α. Now E_p ∩ α is not stationary in α inside M, i.e. there is a CUB C ⊆ α, C ∈ M, with C ∩ E_p = ∅. Since this same C is present in N, E ∩ α is not stationary in α inside N, either. It thus only remains to show that E is stationary in κ inside N, i.e. for any CUB C ⊆ κ belonging to N, C ∩ E ≠ ∅. This turns out to be a bit tricky.

Any C ∈ 𝒫^N(κ) has form I_G(t) for some t ∈ 𝒫^M(P × κ). If C is CUB, some p₀ ∈ G forces that t is CUB in κ*. Let Γ = {(p, p): p ∈ P}, the canonical term having I_G(Γ) = G. It follows that if p ∈ P, α < α_p, then p ⊩ (α* ∈ ∪ range Γ) iff α ∈ E_p. (Since for any generic H with p ∈ H, α ∈ ∪ range I_H(Γ) = ∪_{q∈H} E_q iff α ∈ E_p.) To prove that C ∩ E ≠ ∅, it suffices to show that some p ∈ G forces

\[ (*) \qquad t \cap \bigcup \operatorname{range} \Gamma \ne \emptyset. \]

We show that in fact p₀ ⊩ (*), by showing that the set of p ≤ p₀ forcing (*) is dense below p₀. And for this it suffices to find for any p ≤ p₀ a q ≤ p and an α such that q ⊩ α* ∈ t and α ∈ E_q.

So let p ≤ p₀ be arbitrary. All of the following is true inside M: We can define a decreasing sequence q_n = (E_n, α_n) in P, and an increasing sequence β_n of ordinals < κ, for n < ω, as follows: Let q₀ = p, β₀ = 0. If q_n, β_n are defined, then since q_n ≤ p₀, q_n ⊩ t is unbounded in κ*. Hence q_n ⊩ ∃β (α_n* + 1, β_n* < β < κ* ∧ β ∈ t). We can thus choose q_{n+1} = (E_{n+1}, α_{n+1}) ≤ (E_n, α_n + 1) ≤ q_n and β_{n+1} with α_n + 1, β_n < β_{n+1} < κ and q_{n+1} ⊩ β_{n+1}* ∈ t (extending q_{n+1} if necessary so that β_{n+1} < α_{n+1}). Let E′ = ∪_{n<ω} E_n, γ = sup{α_n: n < ω} = sup{β_n: n < ω}. Then q = (E′ ∪ {γ}, γ + 1) is ≤ all q_n and has γ ∈ E_q. Moreover, q ⊩ t is closed in κ*, and q ⊩ β_n* ∈ t for all n, so q ⊩ γ* ∈ t. Thus we have found a q ≤ p and a γ ∈ E_q such that q ⊩ γ* ∈ t, as required. □


We close this section with a mention of two more principles. 


4.17. EXERCISE. If T is an ω_2 Kurepa tree, F a family of branches through
T, and C ⊆ T is of cardinality ω_1, then {B ∩ C: B ∈ F} has cardinality
≤ ω_1. (Because C lies below level α of T for some α < ω_2, and B ∩ C is then
completely determined by the unique a ∈ B ∩ T_α, and card T_α ≤ ω_1.) Show
that we can get a CTM in which there is an ω_2 Kurepa tree T and a family F of
ω_3 branches through T such that for any countable C ⊆ T, {B ∩ C: B ∈ F} is
countable. Do this by considering the suborder of the KH(ω_2) PO set
consisting of those p such that for any countable C ⊆ T_p, the family
{{a ∈ C: a ≤ b}: b ∈ α_p-th level of T_p} is countable. Show that this
suborder is still ω_1-closed, etc., then proceed as in Theorem 4.4.


4.18. EXERCISE (more difficult). Let κ be a regular cardinal. ◊(κ) is the
proposition that there exists a function S with domain κ such that:

(i) S(α) ⊆ α for all α < κ.

(ii) For any A ⊆ κ, {α < κ: S(α) = A ∩ α} is stationary in κ.

Let the ◊(κ) PO set be the set of all functions p with domain an ordinal < κ
and p(α) ⊆ α for all α ∈ dom p, partially ordered by reverse inclusion.

Show that if M is a CTM with M ⊨ GCH, κ = ω_1^M, M ⊨ P is the ◊(κ) PO
set, and G ⊆ P is M-generic, N = M[G], S = ⋃G ∈ N, then S is as
required to make ◊(ω_1) true inside N. Do this by considering terms
t, u ∈ M such that I_G(t), I_G(u) ⊆ κ and I_G(t) is CUB. Show how to form
inside M sequences p_n ∈ P, B_n ∈ M, β_n < κ such that p_n ⊩ β_n* ∈ t,
β_{n+1} > dom p_n, p_{n+1} ⊩ u ∩ β_n* = B_n*. (To get this last, note that by
closure 𝒫^N(β) = 𝒫^M(β) for any β < κ. Thus if v ∈ 𝒫^M(P × β) and
X = 𝒫^M(β), p ⊩ v ∈ X*.) Then show any q ≤ all p_n will force β* ∈ t and
(⋃Ġ)(β*) = B*, where β is the sup of the β_n, B the union of the B_n. The
argument parallels the proof of Theorem 4.16.


5. Doing two things at once 


Sometimes to construct a model a natural procedure is to start with a
CTM, M, a PO set Q ∈ M, an M-generic H ⊆ Q, and the Cohen extension
M' = M[H], and then take a new PO set R ∈ M', and an M'-generic
K ⊆ R, forming the extension N = M'[K] = M[H][K]. We employed
such a procedure in proving Theorem 4.5. It is also a natural approach to
proving the consistency of such combinations as 2^ω > ω_1 ∧ 2^{ω_1} > 2^ω. We
show that such double forcing can be reduced to single forcing. We
consider first the case that R ∈ M' already belongs to M. In this section it is
necessary to distinguish a PO set P from its underlying set |P|.


5.1. DEFINITION. Let Q, R be PO sets. The direct product Q ⊗ R of Q and
R is the PO set P with underlying set |P| = |Q| × |R| (Cartesian product)
and partial ordering (q, r) ≤_P (q', r') iff q ≤_Q q' and r ≤_R r'. Note that the
maximum element 1_P of P is thus (1_Q, 1_R).
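
As a purely finite illustration of Definition 5.1 (a sketch only; the toy
PO sets and the helper names direct_product and maximum are our own
assumptions, and nothing here involves a CTM), the componentwise order and
the maximum element (1_Q, 1_R) can be computed as follows.

```python
from itertools import product

def direct_product(Q, leq_Q, R, leq_R):
    """Return (|P|, leq_P) for the direct product of two finite PO sets."""
    elems = list(product(Q, R))

    def leq_P(p, p2):
        (q, r), (q2, r2) = p, p2
        # (q, r) <=_P (q2, r2) iff q <=_Q q2 and r <=_R r2
        return leq_Q(q, q2) and leq_R(r, r2)

    return elems, leq_P

def maximum(elems, leq):
    """Return the element lying above every element, or None."""
    for m in elems:
        if all(leq(x, m) for x in elems):
            return m
    return None

# Hypothetical toy PO sets: finite sets ordered by reverse inclusion,
# so the empty set is the maximum (weakest) condition.
Q = [frozenset(), frozenset({0}), frozenset({1})]
R = [frozenset(), frozenset({"a"})]
rev_incl = lambda x, y: x >= y  # x <= y in the PO set iff x is a superset of y

P, leq_P = direct_product(Q, rev_incl, R, rev_incl)
one_P = maximum(P, leq_P)
```

Here one_P is the pair of maxima of the two factors, in agreement with the
note at the end of the definition.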


5.2. FIRST PRODUCT THEOREM. Let M be a CTM, Q, R ∈ M PO sets, and
P = Q ⊗ R. Then:

(i) If G is an M-generic subset of P, and we set

    H = {q ∈ |Q|: ∃r ∈ |R| ((q, r) ∈ G)},
    K = {r ∈ |R|: ∃q ∈ |Q| ((q, r) ∈ G)},

then H is an M-generic subset of Q, K is an M[H]-generic subset of R, and
G = H × K.

(ii) If H is an M-generic subset of Q, and K an M[H]-generic subset of R,
then H × K is an M-generic subset of P.

(iii) If H is an M-generic subset of Q, and K an M[H]-generic subset of
R, then K is also M-generic, and H is M[K]-generic.


PROOF. (i) Note that if (q, r) ∈ G, then (1_Q, r) and (q, 1_R) ∈ G, since these
elements are ≥_P (q, r). Thus H = {q ∈ |Q|: (q, 1_R) ∈ G}, K = {r ∈ |R|:
(1_Q, r) ∈ G}.

To show H M-generic: Consider an arbitrary dense D ∈ 𝒫^M(|Q|).
D × |R| is dense in P, hence there is (q, r) ∈ (D × |R|) ∩ G. Then q ∈ D ∩ H,
so D ∩ H ≠ ∅.

To show K M[H]-generic: Consider an arbitrary dense D ∈
𝒫^{M[H]}(|R|). D = I_H(t) for some t ∈ 𝒫^M(|Q| × |R|), and some q_0 ∈ H
forces that t is dense. Let E = {(q, r): q ≤_Q q_0 ∧ q ⊩ r* ∈ t}. (E ∈ M by the
Definability Lemma.) We claim E is dense below (q_0, 1_R) in P.

For if (q, r) ≤_P (q_0, 1_R), then q ⊩ (t is dense in R*), so q ⊩ ∃r' ∈
|R|* (r' ≤ r* ∧ r' ∈ t). So there exist q' ≤_Q q and r' ≤_R r such that
q' ⊩ (r')* ∈ t. But (q', r') ≤_P (q, r) and belongs to E, proving density.

It follows (cf. Lemma 2.13) that there is some (q, r) ∈ E ∩ G. Thus r ∈ K,
and since q ∈ H forces r* ∈ t, r ∈ D. Thus D ∩ K ≠ ∅ as required to prove
K generic.

Finally H × K ⊆ G since if q ∈ H, r ∈ K, then p_0 = (q, 1_R) and p_1 =
(1_Q, r) belong to G, so there is p' = (q', r') ∈ G with p' ≤_P p_0, p_1. Clearly
(q, r) ≥_P p', so (q, r) ∈ G as required. The inclusion G ⊆ H × K is
immediate, completing the proof of (i).

(ii) will not be used below, and its proof is left to the interested reader.
(iii) follows from (i), (ii), and the fact that Q ⊗ R and R ⊗ Q are
isomorphic. □
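
The identity G = H × K of Theorem 5.2(i) can be watched in a finite toy
case. The sketch below is only an illustration under our own assumptions:
in a finite PO set the filter generated by a minimal element meets every
dense set, so it stands in for a generic set; the PO sets and the helper
up_closure are hypothetical, and no CTM is involved.

```python
from itertools import product

def up_closure(elems, leq, p):
    """The filter generated by p: all elements of elems lying above p."""
    return {x for x in elems if leq(p, x)}

# Toy PO sets (assumed for illustration): finite sets, stronger = larger.
Q = [frozenset(), frozenset({0}), frozenset({1})]
R = [frozenset(), frozenset({"a"}), frozenset({"b"})]
leq = lambda x, y: x >= y
one_Q, one_R = frozenset(), frozenset()

P = list(product(Q, R))
leq_P = lambda p, p2: leq(p[0], p2[0]) and leq(p[1], p2[1])

# A minimal element of P generates a filter meeting every dense subset of P.
G = up_closure(P, leq_P, (frozenset({0}), frozenset({"b"})))

# The two projections of G, as in Theorem 5.2(i).
H = {q for (q, r) in G}
K = {r for (q, r) in G}

# The alternative descriptions used at the start of the proof of (i).
H_alt = {q for q in Q if (q, one_R) in G}
K_alt = {r for r in R if (one_Q, r) in G}
```

In this example G is exactly the set of pairs (q, r) with q ∈ H and r ∈ K,
and the two descriptions of H and of K coincide.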


The case of double forcing with R ∈ M' − M is more difficult. We can
make some simplifying assumptions: (i) We assume |R| is an ordinal ρ.
Hence |R| ∈ M, since OR^{M'} = OR^M. (ii) We assume the maximum element
1_R of R is just the ordinal 1. Any PO set is isomorphic to one satisfying
these conditions.

The order relation ≤_R of R has the form I_H(t) for some term
t ∈ 𝒫^M(|Q| × ρ × ρ). Moreover, some q ∈ H forces that t makes ρ* a PO
set with maximum element 1*. Applying Corollary 2.17, we can get a term
Ṙ such that

(*)     Q ⊩ Ṙ makes ρ* a PO set with maximum element 1*

and q ⊩ t = Ṙ, whence I_H(Ṙ) = I_H(t), i.e., I_H(Ṙ) is the order relation on
R. A term satisfying (*) we call good.


5.3. DEFINITION. Let M be a CTM, Q ∈ M a PO set, ρ ∈ OR^M, X =
𝒫^M(|Q| × ρ), Ṙ ∈ M a good term. Let P_0 = {(q, ṙ) ∈ |Q| × X: q ⊩ ṙ ∈ ρ*}.
For elements of P_0 define (q, ṙ) ≤_0 (q', ṙ') iff q ≤_Q q' and q ⊩ ṙ ≤ ṙ' in Ṙ,
i.e. q ⊩ (ṙ, ṙ') ∈ Ṙ. The relation ≤_0 will not in general be antisymmetric, but

(**)     (q, ṙ) ≤_0 (q', ṙ') and (q', ṙ') ≤_0 (q, ṙ)

is easily seen to define an equivalence relation. In fact (**) amounts to
saying that q = q' and q ⊩ (ṙ = ṙ'). Let [q, ṙ] be the equivalence class of
(q, ṙ). Let |P| be the set of all such equivalence classes, and define
[q, ṙ] ≤_P [q', ṙ'] iff (q, ṙ) ≤_0 (q', ṙ'). (This definition is easily seen to be
independent of choice of equivalence class representatives.) Then P =
⟨|P|, ≤_P⟩ is a PO set with maximum element 1_P = [1_Q, 1*]. P is called the
forcing product of Q and Ṙ (with respect to M), and we write P = Q ⊗ Ṙ.


5.4. SECOND PRODUCT THEOREM. Let the notation be as in 5.3. Let G be an
M-generic subset of P. Set

    H = {q ∈ |Q|: ∃ṙ ∈ X ([q, ṙ] ∈ G)},
    K = {I_H(ṙ): ∃q ∈ |Q| ([q, ṙ] ∈ G)}.

Then H is an M-generic subset of Q and K is an M[H]-generic subset of the
PO set R = (ρ, I_H(Ṙ)).


PROOF. We leave the proof that H is M-generic to the reader. To show K
M[H]-generic, consider an arbitrary dense subset D of R belonging to M[H].
D = I_H(t) for some t ∈ X, and there is q_0 ∈ H which forces that t is dense.
We claim E = {[q, ṙ]: q ⊩ ṙ ∈ t} is dense below [q_0, 1*] in P. For let
[q, ṙ] ≤_P [q_0, 1*]. Since q ⊩ t is dense, q ⊩ ∃σ ∈ ρ* (σ ≤ ṙ in Ṙ ∧ σ ∈ t).
Hence there exist q' ≤_Q q and σ ∈ ρ such that q' ⊩ (σ* ≤ ṙ in Ṙ ∧ σ* ∈ t).
Thus [q', σ*] ≤_P [q, ṙ] and belongs to E, proving density. It follows that
there is some [q, ṙ] ∈ E ∩ G. Then I_H(ṙ) ∈ K, and q ∈ H. Since, further,
q ⊩ ṙ ∈ t, I_H(ṙ) ∈ D. Thus D ∩ K ≠ ∅ as required. □
Theorem 5.2(ii) reduces double forcing to single forcing. Theorem 5.2(i)
enables us to regard a single forcing as a two-stage affair in certain 
circumstances. The usefulness of this result is illustrated by Theorem 5.7 
below. For the rest of this section we can revert to our old abuse of 
notation, not distinguishing between a PO set and its underlying set. No 
confusion should result. 


5.5. DEFINITION. Let κ be an inaccessible cardinal. The κ-Collapsing PO
set is

    P = {p: (p is a function) ∧ (dom p is countable) ∧ dom p ⊆ κ × ω_1 ∧
            ∀(α, ξ) ∈ dom p (p(α, ξ) < α)},

partially ordered by reverse inclusion. For α < κ, let P_α be the suborder
{p ∈ P: dom p ⊆ α × ω_1} of P, and P^α the suborder {p ∈ P: dom p ⊆
(κ − α) × ω_1}. Then P_α ⊗ P^α and P are isomorphic under the map sending
(q, r) to q ∪ r.
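
The splitting of a condition into its parts in P_α and P^α can be pictured
with finite stand-ins. The sketch below is an illustration only: finite
dictionaries replace countable partial functions, small integers replace
ordinals, and the helper names is_condition, split, join and compatible are
our own assumptions.

```python
def is_condition(p):
    """p maps pairs (alpha, xi) to a value below alpha (finite stand-ins)."""
    return all(0 <= value < alpha for (alpha, xi), value in p.items())

def split(p, alpha):
    """Send p to the pair (q, r): q has first coordinates below alpha,
    r has first coordinates at or above alpha."""
    q = {k: v for k, v in p.items() if k[0] < alpha}
    r = {k: v for k, v in p.items() if k[0] >= alpha}
    return q, r

def join(q, r):
    """The map (q, r) -> q ∪ r; the two domains are disjoint by construction."""
    return {**q, **r}

def compatible(p, q):
    """Under reverse inclusion, p and q are compatible iff they agree on the
    common part of their domains (then p ∪ q extends both)."""
    return all(q[k] == v for k, v in p.items() if k in q)

p = {(3, 0): 2, (5, 1): 0, (7, 0): 6}
q_part, r_part = split(p, 5)
rejoined = join(q_part, r_part)
```

Splitting at α = 5 and rejoining returns the original condition, which is
the finite shadow of the isomorphism between P_α ⊗ P^α and P.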


5.6. LEMMA. Let κ be an inaccessible cardinal, P the κ-Collapsing PO set.
Then P is κ-AC.


PROOF. Let A ⊆ P have cardinality κ. We show A is not an antichain. To
do this we define for ξ < ω_1 sets A_ξ ⊆ A with card A_ξ < κ, as follows: Let
A_0 = {a} for some arbitrary a ∈ A. If A_ξ is defined for ξ < η, let
D_η = ⋃_{ξ<η} ⋃_{p∈A_ξ} dom p. By regularity, card D_η < κ, and D_η ⊆ α × ω_1 for
some α < κ. Thus B_η = {p|D_η: p ∈ A} has card B_η ≤ (card α)^{ω_1} < κ. So
we can pick A_η ⊇ ⋃_{ξ<η} A_ξ with A_η ⊆ A and card A_η < κ such that for
every q ∈ B_η there is p ∈ A_η with p|D_η = q. Finally, let A' be the union of
the A_ξ, ξ < ω_1, and D the union of the D_ξ. Card A' < κ, so there is
p ∈ A − A'. Dom p is countable, so dom p ∩ D ⊆ D_ξ for some ξ < ω_1.
Hence for some q ∈ A_ξ ⊆ A', p|(dom p ∩ D) = q|D_ξ. Since dom q ⊆ D,
p and q agree on dom p ∩ dom q. Thus p ∪ q is a function extending both p
and q, and A is not an antichain. □


Collapsing PO sets were introduced by Lévy and used by him and by
Solovay in many impressive results. (See SOLOVAY [1970].) The example
given here (Theorem 5.7) is due to SILVER [1971]. Just for this result we
assume the existence of a CTM M with M ⊨ there exists an inaccessible
cardinal. By methods of Gödel (mentioned above after Theorem 3.14),
having one such M we can get one with M ⊨ GCH.
5.7. THEOREM (Silver). There exists a CTM M with M ⊨ ¬KH(ω_1).


PROOF. We are assuming that there exists a CTM M and a κ ∈ M such
that M ⊨ κ is an inaccessible cardinal. As remarked, we may take
M ⊨ GCH. Let P ∈ M be such that M ⊨ P is the κ-Collapsing PO set. Let
G ⊆ P be M-generic, N = M[G]. By the (trivial) countable closure of P
inside M and the κ-AC, all cardinals in M that are ≤ ω_1^M or ≥ κ are
preserved in N. But it is easily seen that for α < κ,

    {(ξ, β): ∃p ∈ G ((α, ξ) ∈ dom p ∧ p(α, ξ) = β)} ∈ N

is a surjection from ω_1^M to α. Thus there are no cardinals in N between
ω_1^M and κ, i.e. κ = ω_2^N.

Any tree of height ω_1 whose levels are all countable is isomorphic to a
tree with underlying set |T| = ω_1.

Suppose T ∈ M is a tree with underlying set ω_1^M. Let F be the set of all
branches through T in M. Then card^M F ≤ ω_2^M and card^N F ≤ card^N ω_2^M =
ω_1^N. Moreover, by Lemma 4.6, every branch through T in N already
belongs to M. So in N, T cannot be a Kurepa tree.

Now suppose more generally T ∈ N is a tree with underlying set ω_1^M. Let
t ∈ 𝒫^M(P × ω_1^M × ω_1^M) be such that I_G(t) is the order relation <_T of T.

All the following is true inside N: If ξ <_T η, then there is p ∈ G with
(p, ξ, η) ∈ t. Let f(ξ, η) be one such p. If ξ, η < ω_1 and not ξ <_T η, then
there is p ∈ G such that for no q ≤ p do we have (q, ξ, η) ∈ t. (Viz., any p
forcing (ξ*, η*) ∉ t.) Let f(ξ, η) be one such p. By regularity of κ (= ω_2),
there is α < κ such that dom f(ξ, η) ⊆ α × ω_1 for all ξ, η.

Now P = P_α ⊗ P^α, and Theorem 5.2 implies that G_α = G ∩ P_α is
M-generic, and G^α = G ∩ P^α is M[G_α]-generic. Moreover M[G_α][G^α] =
M[G] = N. All the f(ξ, η) ∈ G_α, so <_T is I_{G_α}(t ∩ (P_α × ω_1^M × ω_1^M)), and
T ∈ M[G_α].

Let ν be (card α)^{++} in the sense of M. Since κ is inaccessible inside
M, ν < κ. It is easily seen that card^M P_α < ν, so trivially M ⊨ P_α is ν-AC.
Thus all cardinals ≥ ν are preserved in M[G_α]. By an easy computation
using Theorem 3.15, M[G_α] ⊨ 2^μ ≤ ν, where μ = ω_1^M. Hence if F is
the set of all branches through T in M[G_α], card^{M[G_α]} F ≤ ν < κ, and
card^N F ≤ ω_1^N.

Now M ⊨ P_α, P^α are ω-closed. By ω-closure for P_α, there are no
countable sequences of elements of P^α present in M[G_α] except the ones
already present in M. Hence M[G_α] ⊨ P^α is ω-closed. Thus we can apply
4.6 to conclude that any branch through T in N = M[G_α][G^α] was already
present in M[G_α], i.e. belongs to F. Thus T is not a Kurepa tree inside N.
Since T was arbitrary, N ⊨ ¬KH(ω_1). □


We will soon see that Theorem 5.2, which was designed to let us do two 
things at once, actually lets us do infinitely many things at once. First a 
crucial lemma. 


5.8. LEMMA. Let M be a CTM, Q, R ∈ M PO sets, P = Q ⊗ R, κ a cardinal
in M. Suppose

    M ⊨ (κ is regular ∧ ∀λ < κ (Q is λ-closed) ∧ R is κ-AC).

Then P ⊩ κ* is a cardinal.


PROOF. Let G ⊆ P be M-generic. Let H ⊆ Q, K ⊆ R be as in 5.2(i). We
must show κ is preserved in M[G] = M[H][K]. Assume for contradiction
that λ < κ and f ∈ M[G] is a surjection from λ to κ. f = I_G(t) for some t ∈
𝒫^M(P × λ × κ), and some p_0 = (q_0, r_0) ∈ G forces that t is a surjection
from λ* to κ*.

All the following is true inside M: For β < λ, q ≤_Q q_0, let us say a set
A ⊆ κ β-supports q if whenever (q', r') ≤_P (q, r_0) and α < κ and
(q', r') ⊩ t(β*) = α*, then α ∈ A. Let D_β = {q ∈ Q: ∃A ⊆ κ (card A < κ ∧
A β-supports q)}. We claim each D_β is dense below q_0.

Still inside M, we prove this claim as follows: Let q ≤_Q q_0 be arbitrary.
We form, for some ζ < κ to be determined, a decreasing sequence
q_ξ, 0 < ξ ≤ ζ, in Q, and r_ξ ∈ R, α_ξ < κ, for 0 < ξ < ζ, as follows: Let q_1 = q,
r_1 = r_0, α_1 = 0. If η < κ and q_ξ, α_ξ are defined for 0 < ξ < η, then by the
closure condition on Q, there is q' ≤_Q all q_ξ. If {α_ξ: 0 < ξ < η} β-supports
q', then we have found a q' ≤_Q q in D_β, and we may stop, setting q_η = q'
and ζ = η. Otherwise, there exist (q_η, r_η) ≤_P (q', r_0) and α_η < κ such that
(q_η, r_η) ⊩ t(β*) = α_η*. (Since (q', r_0) ⊩ ∃α < κ* (t(β*) = α).) Note that for
distinct ξ the (q_ξ, r_ξ) thus constructed are incompatible. (They force
contradictory things about t(β*).) Since the q_ξ form a descending
sequence, the r_ξ must be incompatible. Since R is κ-AC, the construction
must stop at some stage ζ < κ, yielding q_ζ ≤_Q q in D_β, as required to prove
density.

The following is also true inside M: Q is λ-closed, hence λ-distributive.
It follows D = ⋂_{β<λ} D_β is dense below q_0. For q ∈ D, let A(q, β) be for
each β < λ a set of cardinality < κ β-supporting q, and let A(q) =
⋃_{β<λ} A(q, β). Card A(q) < κ by regularity.

All this was true inside M. Now from the density of D below q_0 ∈ H, it
follows there is q ∈ D ∩ H. Since card^M A(q) < κ, there is α ∈ κ − A(q).
Now α = f(β) for some β < λ. But then there is (q', r') ∈ G forcing
t(β*) = α*, and we may as usual take it ≤_P (q, r_0) ∈ G. But then α ∈
A(q, β) ⊆ A(q), a contradiction which completes the proof. □


5.9. THEOREM (Easton). There is a CTM M with M ⊨ ∀κ < ω_ω (2^κ > κ^+).


PROOF. For n < ω, let P_n be

    {p: p is a function ∧ card dom p < ω_n ∧
        dom p ⊆ ω_{n+2} × ω_n ∧ range p ⊆ {0, 1}},

partially ordered by reverse inclusion. Let Q_n be the set of all sequences
(p_0, p_1, ..., p_n) with p_i ∈ P_i, partially ordered by setting (p_0, ..., p_n) ≤
(q_0, ..., q_n) iff p_i ≤ q_i in P_i for all i. Let Q^n be the set of all infinite
sequences (p_{n+1}, p_{n+2}, ...) with p_i ∈ P_i, similarly ordered. Assuming GCH,
it is not hard to see, using Lemma 3.6, that each Q_n is ω_{n+1}-AC. Clearly
each Q^n is ω_n-closed. Equally clearly Q_m ⊗ Q^m and Q_n ⊗ Q^n are isomorphic
for all m, n. We call Q_0 ⊗ Q^0 the Easton PO set.

Now let M be a CTM with M ⊨ GCH. Let P ∈ M be such that M ⊨ P is
the Easton PO set. Let G ⊆ P be M-generic, N = M[G]. The analysis
Easton PO set = Q_n ⊗ Q^n, together with Lemma 5.8, shows that every
cardinal of M is preserved in N. It is not hard to see (cf. 3.16(iii)) that inside
N we get from G ∩ P_n a family of ω_{n+2} subsets of ω_n. Thus N is the sort of
CTM required to prove the theorem. □


Theorem 5.9 admits of countless refinements. For the last word, see 
Easton [1970]. 


6. Martin’s Axiom 


Martin’s Axiom (MA) is the proposition:

If P is a CCC (i.e. ω_1-AC) PO set, and F a family of < 2^ω dense subsets of
P, then there exists an F-generic subset G of P.

Note that CH implies MA by Proposition 2.3. Many important consequences
of CH in fact follow from MA. MA is most interesting, however, when we have
2^ω > ω_1. We will prove here the consistency of MA plus 2^ω = ω_2. Chapter
B.6 contains many applications of MA.

We begin by reducing MA to something apparently weaker. 
6.1. PROPOSITION. Suppose that for every CCC PO set P of cardinality < 2^ω
and every family F of < 2^ω dense subsets of P, there exists an F-generic
subset G of P. Then MA holds.


PROOF. Let P be an arbitrary CCC PO set, F a family of < 2^ω dense
subsets of P. Let f: P × P → P be a function such that for compatible
p, q ∈ P, f(p, q) ≤_P p and f(p, q) ≤_P q. For D ∈ F, let f_D: P → D be such
that f_D(p) ≤_P p for all p. Let Q be the smallest suborder of P containing 1_P
and closed under f and all the f_D. Then card Q < 2^ω. By closure under f,
elements of Q which are compatible in P are still compatible in Q, hence Q
is CCC. By closure under the f_D, D ∩ Q is dense in Q for all D ∈ F. By
assumption, therefore, there exists a {D ∩ Q: D ∈ F}-generic H ⊆ Q. It is
easily seen that {p ∈ P: ∃q ∈ H (q ≤_P p)} is an F-generic subset of P. Since
P, F were arbitrary, this proves MA. □
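
The closure step in this proof (forming the smallest suborder Q containing
1_P and closed under f and the f_D) can be imitated in a finite toy case.
The following sketch rests entirely on our own assumptions: P is taken to
be the binary strings of length at most 3 ordered by extension, D_k is the
set of strings of length at least k, and the functions f, make_f_D and
closure are hypothetical choices of the maps named in the proof.

```python
def closure(start, binary_ops, unary_ops):
    """Smallest superset of `start` closed under the given operations."""
    Q = set(start)
    while True:
        new = {g(p) for g in unary_ops for p in Q}
        new |= {f(p, q) for f in binary_ops for p in Q for q in Q}
        if new <= Q:
            return Q
        Q |= new

# p <= q iff p extends q; the empty string is the maximum element 1_P.
leq = lambda p, q: p.startswith(q)

def f(p, q):
    """A common extension of p and q when they are compatible."""
    if leq(p, q):
        return p
    if leq(q, p):
        return q
    return ""  # value is irrelevant for incompatible pairs

def make_f_D(k):
    """f_D for D_k = strings of length >= k: pad p with zeros up to length k."""
    return lambda p: p if len(p) >= k else p + "0" * (k - len(p))

ks = [1, 2, 3]
Q = closure({""}, [f], [make_f_D(k) for k in ks])

# Each D_k ∩ Q is dense in Q, as the proof requires.
dense_in_Q = all(any(len(d) >= k and leq(d, p) for d in Q)
                 for k in ks for p in Q)
```

In this instance Q consists of the four strings of zeros of length at most
3, a much smaller suborder than P in which every D_k remains dense.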


Let us call a PO set P normalized if its underlying set is either ω or ω_1
and its maximum element is the ordinal 1. Clearly every PO set of
cardinality ≤ ω_1 is isomorphic to a normalized PO set. Since we will
be working extensively with such sets, it becomes necessary again to
distinguish sharply between a PO set P and its underlying set |P|. Thus
"D is a dense subset of P" means literally that D ⊆ |P| and D is dense in
the order ≤_P.

Let us outline the procedure, taken from SOLOVAY and TENNENBAUM
[1971], which we will use to construct a model of MA and 2^ω = ω_2. We saw
in Corollary 3.16(i) that there exists a CTM M with M ⊨ 2^ω = 2^{ω_1} = ω_2. If M
is such, we say (Q, F) ∈ M is a counterexample to MA in M if Q is a PO
set, M ⊨ (Q is normalized and CCC), F is a family of dense subsets of Q,
card^M F ≤ ω_1^M, and there exists no F-generic subset of Q in M. M ⊨ MA iff
no such counterexample exists. (Cf. Proposition 6.1, Lemma 2.12(iv).)

If (Q, F) ∈ M is a counterexample to MA in M, and H is an M-generic
subset of Q, then a fortiori H is F-generic. Thus (Q, F) is not a
counterexample to MA in M[H]. Note that by CCC, M and M[H] have
the same cardinals. Thus forcing can destroy one counterexample to MA.
But probably in M[H] there are counterexamples left over from M which
were not destroyed by adjoining H, and probably also new counterexamples
in M[H] − M. If (Q', F') is one of these counterexamples in M[H], we
can of course take an M[H]-generic subset H' of Q' and form a model
M[H][H'] in which neither (Q, F) nor (Q', F') is a counterexample to
MA. This double forcing could even be reduced to single forcing using the
Product Theorems. But clearly if we are going to get a model with no
counterexamples, we are going to want to adjoin infinitely many generic
sets. This will require much more sophisticated iteration techniques, to
which we now turn. The point of the lemmas below (6.2 and 6.5) is that we
can perform elaborate constructions on PO sets without losing CCC. The
lemmas come from SOLOVAY and TENNENBAUM [1971].


6.2. LEMMA. Let M be a CTM, Q ∈ M a PO set, ρ ∈ OR^M, Ṙ ∈
𝒫^M(|Q| × ρ × ρ) a term such that Q ⊩ (Ṙ makes ρ* a PO set with
maximum element 1*), P = Q ⊗ Ṙ. Then if M ⊨ Q is CCC, and Q ⊩ (Ṙ
makes ρ* a CCC PO set), then M ⊨ P is CCC.


PROOF. Ṙ is what we called in Section 5 a good term. Let us write κ for
ω_1^M. Assume for contradiction that there exists inside M a sequence
[q_ξ, ṙ_ξ], ξ < κ, of pairwise incompatible elements in P. For ξ < η < κ, either
q_ξ, q_η are incompatible in Q, or else every q ≤_Q q_ξ, q_η forces that ṙ_ξ, ṙ_η are
incompatible in the order Ṙ. (For if q ≤_Q q_ξ, q_η forces ∃σ ∈ ρ* (σ is below
both ṙ_ξ and ṙ_η in Ṙ), then there exist q' ≤_Q q and σ ∈ ρ such that q' ⊩ (σ*
is below both ṙ_ξ and ṙ_η in Ṙ), i.e. [q', σ*] ≤_P both [q_ξ, ṙ_ξ] and [q_η, ṙ_η],
contrary to the incompatibility of these elements.) All of the following is
true inside M: We may without loss of generality assume that Q forces (i.e.
that 1_Q forces)

(*)     ṙ_ξ, ṙ_η ∈ ρ* → ṙ_ξ, ṙ_η are incompatible in the order Ṙ
        whenever ξ < η < κ.

For if not, fix a term ṡ, e.g. ρ*, such that 1_Q ⊩ ṡ ∉ ρ*. Let ṡ_ξ, ξ < κ, be
terms such that q_ξ ⊩ ṡ_ξ = ṙ_ξ and any q incompatible with q_ξ forces ṡ_ξ = ṡ.
(Corollary 2.17 gives us such ṡ_ξ.) We claim Q forces (*) with the ṡ_ξ
replacing the ṙ_ξ. To see this it suffices to show for each pair ξ < η < κ that
the set of q forcing (*) with ṡ_ξ, ṡ_η replacing ṙ_ξ, ṙ_η is dense. So let q ∈ |Q| be
arbitrary. There is q' ≤_Q q such that either q' is incompatible with one of
q_ξ, q_η, or else q' ≤_Q q_ξ, q_η. In the former case, q' ⊩ ṡ_ξ ∉ ρ* or q' ⊩ ṡ_η ∉ ρ*;
while in the latter case q' ⊩ ṡ_ξ = ṙ_ξ and ṡ_η = ṙ_η, whence q' ⊩ ṡ_ξ, ṡ_η are
incompatible. So in either case we have found q' ≤_Q q forcing (*) with
ṡ_ξ, ṡ_η replacing ṙ_ξ, ṙ_η. So the set of q forcing this is dense, and Q forces it,
and we may as well assume Q forces (*).

Still inside M we have: There is a q in Q forcing ∀ξ < κ* ∃η > ξ
(ṙ_η ∈ ρ*). For assume to the contrary every q forces ∃ξ < κ* ∀η >
ξ (ṙ_η ∉ ρ*). Then D = {q: ∃ξ < κ (q ⊩ ∀η > ξ* (ṙ_η ∉ ρ*))} is dense (by
2.10(v)). By Zorn's Lemma there is an antichain A in Q maximal with
respect to the property A ⊆ D. It is easily seen that A is actually a maximal
antichain in Q (by density of D). Let f: A → κ be a function such that
q ⊩ ∀η > (f(q))* (ṙ_η ∉ ρ*) for all q ∈ A. Since Q is CCC, A is countable.
Since κ is regular, ξ = sup{f(q): q ∈ A} is < κ. Every q ∈ A forces
∀η > ξ* (ṙ_η ∉ ρ*), hence Q forces this (by 2.15(ii)). This contradicts the
fact that q_{ξ+1} ⊩ ṙ_{ξ+1} ∈ ρ*. This contradiction shows that there is indeed a q
forcing ∀ξ < κ* ∃η > ξ (ṙ_η ∈ ρ*).

Since Q ⊩ (*) above and (by CCC for Q) also forces that κ* is a
cardinal, such a q forces that {ṙ_η: η < κ* ∧ ṙ_η ∈ ρ*} is an uncountable
antichain in Ṙ, contrary to Q ⊩ (Ṙ makes ρ* a CCC PO set). This
contradiction completes the proof. □


6.3. DEFINITION. Let P be a PO set, Q a suborder of P. A map
h: |P| → |Q| is a retraction of P to Q if

(i) For all p ∈ P, p ≤ h(p).

(ii) For all q ∈ Q, q = h(q).

(iii) Whenever p' ≤ p, then h(p') ≤ h(p).

(iv) Whenever h(p') ≤ h(p), then there is p'' ≤ p with h(p'') = h(p').

(v) Whenever h(p) and q ∈ Q are compatible in Q, then p and q are
compatible in P.

It is not necessary to distinguish ≤_P and ≤_Q in stating this definition
since Q is a suborder of P.
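
For finite PO sets the five clauses of Definition 6.3 can be checked by
brute force. The sketch below is an illustration under our own assumptions
(the toy PO sets, the helper names compatible and is_retraction, and the
two sample maps are hypothetical); it tests the map (a, b) ↦ (a, 1) on a
direct product, a finite analogue of the natural retraction, against a map
that fails clause (ii).

```python
from itertools import product

def compatible(elems, leq, a, b):
    """a, b are compatible in elems if some c in elems lies below both."""
    return any(leq(c, a) and leq(c, b) for c in elems)

def is_retraction(P, Q, leq, h):
    """Check 6.3(i)-(v) for a map h (a dict) from finite P to a suborder Q."""
    into_Q = all(h[p] in Q for p in P)
    c1 = all(leq(p, h[p]) for p in P)
    c2 = all(h[q] == q for q in Q)
    c3 = all(leq(h[p1], h[p]) for p in P for p1 in P if leq(p1, p))
    c4 = all(any(leq(p2, p) and h[p2] == h[p1] for p2 in P)
             for p in P for p1 in P if leq(h[p1], h[p]))
    c5 = all(compatible(P, leq, p, q)
             for p in P for q in Q if compatible(Q, leq, h[p], q))
    return into_Q and c1 and c2 and c3 and c4 and c5

# Toy data (assumed): conditions are finite sets, stronger = larger.
A = [frozenset(), frozenset({0}), frozenset({1})]
B = [frozenset(), frozenset({"a"}), frozenset({"b"})]
P = list(product(A, B))
leq = lambda p, p2: p[0] >= p2[0] and p[1] >= p2[1]
top = frozenset()
Q = [(a, top) for a in A]  # suborder of pairs with trivial second coordinate

natural = {(a, b): (a, top) for (a, b) in P}   # forget the second coordinate
constant = {p: (top, top) for p in P}          # sends everything to 1_P

natural_ok = is_retraction(P, Q, leq, natural)
constant_ok = is_retraction(P, Q, leq, constant)
```

The first map passes all five clauses; the second fails (ii), since it
moves the non-maximal elements of Q.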


6.4. COROLLARIES TO DEFINITION 6.3. (i) The property of being a retraction is
absolute.

(ii) If h is a retraction of P to Q, and k a retraction of Q to R, then kh is a
retraction of P to R.

(iii) If there exists a retraction of P to a suborder Q of P, then any
q, q' ∈ |Q| which are compatible in P are compatible in Q.

(iv) If M is a CTM, P ∈ M a PO set of the form Q ⊗ Ṙ, then the map
h([q, ṙ]) = [q, 1*] is a retraction of P to a suborder Q' of P isomorphic to Q.
This h is called the natural retraction.

(v) If M is a CTM, P, Q ∈ M PO sets, h ∈ M a retraction of P to Q, then
for any M-generic subset G of P, G ∩ |Q| is an M-generic subset of Q.
Further if φ(x) is an absolute formula, and t ∈ M a Q-term such that
Q ⊩ φ(t), then P ⊩ φ(t).


PROOF. (i) could have been added to the list 2.12. We leave it to the reader,
along with (ii).

(iii) Suppose p ∈ |P| is ≤ q, q'. By 6.3(ii) and (iii), h(p) ∈ |Q| is ≤ q, q'.

(iv) Q' contains all elements of P of form [q, 1*] and is isomorphic to Q
under the map sending q to [q, 1*]. We leave verification of 6.3(i)–(iii) to
the reader.

To see 6.3(iv): Let p = [q, ṙ], p' = [q', ṙ'], and suppose h(p') ≤_P h(p), i.e.
q' ≤_Q q. Then p'' = [q', ṙ] is defined, p'' ≤_P p, h(p'') = h(p').

To see 6.3(v): Let p = [q, ṙ], p' = [q', 1*], so that p' is in Q'. Suppose
h(p) = [q, 1*] and p' are compatible in Q', say p'' = [q'', 1*] is below both.
Then q'' ≤_Q q, q', and [q'', ṙ] is defined and is below both p and p'.

(v) Let G be an M-generic subset of P, and consider an arbitrary dense
subset D of Q belonging to M. We claim E = {p ∈ |P|: h(p) ∈ D} is dense
in P. For let p be any element of P. There is some q ∈ D with q ≤ h(p). By
6.3(iv), there is p' ≤ p with h(p') = h(q) = q, so p' ∈ E, proving density of
E. By genericity of G, there exists p ∈ E ∩ G. Then h(p) ∈ D, and since
p ≤ h(p), h(p) ∈ G. Thus D ∩ G ≠ ∅, proving genericity for G ∩ |Q|.

Note that trivially any Q-term is also a P-term. Further if G is an
M-generic subset of P, I_G(t) = I_{G∩|Q|}(t). Thus if Q ⊩ φ(t), where φ is
absolute, then for any generic subset G of P, M[G ∩ |Q|] ⊨ φ(I_G(t)), and
by absoluteness φ(I_G(t)) is true and true inside M[G], showing that
P ⊩ φ(t). □


It will now prove convenient to modify the definition of the forcing
product Q ⊗ Ṙ so as to make Q itself (and not the isomorph Q' mentioned
in 6.4(iv)) a suborder of Q ⊗ Ṙ and to make the natural retraction a
retraction to Q. This can easily be accomplished by expelling from Q ⊗ Ṙ
every element of form [q, 1*], putting q in its place.


6.5. LEMMA. Let λ be a limit ordinal. Let P_ξ, ξ ≤ λ, be an increasing
sequence of PO sets such that for limits η ≤ λ, P_η is the union of the P_ξ, ξ < η.
Let h_ξ, ξ < λ, be retractions of P_λ to P_ξ such that for ξ < η < λ, h_ξ = h_ξ h_η.
Then if each P_ξ, ξ < λ, is CCC, so is P_λ.


PROOF. For the definition of the union of an increasing sequence of PO
sets, see Definition 2.1. We leave it to the reader to verify that for
ξ < η < λ, the restriction of h_ξ to |P_η| is a retraction of P_η to P_ξ.
Assume for contradiction that A is an uncountable antichain in P_λ. For
ξ < λ, P_ξ is CCC, so A ∩ |P_ξ|, being an antichain, is countable. It follows
cf λ > ω. We define an increasing sequence ξ(n) of ordinals < λ and a
sequence B(n) with B(n) ⊆ P_{ξ(n)}, for n < ω, as follows: Let ξ(0) = 0.
Having ξ(n), let B(n) be an antichain in P_{ξ(n)}, maximal with respect to the
property of being a subset of {h_{ξ(n)}(p): p ∈ A}. Necessarily B(n) is
countable. Since P_λ is the union of the P_ξ, ξ < λ, for each q ∈ B(n) there is
an η(q) < λ with q = h_{ξ(n)}(p) for some p ∈ |P_{η(q)}| ∩ A. Since cf λ > ω,
there is an ordinal ξ(n+1) < λ which is > ξ(n) and > η(q) for all
q ∈ B(n).

Now let η = sup{ξ(n): n < ω} < λ. As A ∩ |P_η| is countable, there
exists some p ∈ A − |P_η|. Let p' = h_η(p). Since P_η is the union of the P_{ξ(n)},
there is an n with p' ∈ |P_{ξ(n)}|. For ξ with ξ(n) ≤ ξ ≤ η, h_ξ(p) = h_ξ h_η(p) =
h_ξ(p') = p'. By maximality of B(n), p' = h_{ξ(n)}(p) is compatible with some
q ∈ B(n). Now q itself has form h_{ξ(n)}(r) for some r ∈ A ∩ |P_{ξ(n+1)}| by
construction.

By 6.3(v), since p' ∈ |P_{ξ(n)}| and q = h_{ξ(n)}(r) are compatible in P_{ξ(n)}, p' and
r are compatible in P_{ξ(n+1)}. Again, since r ∈ |P_{ξ(n+1)}| and p' = h_{ξ(n+1)}(p) are
compatible in P_{ξ(n+1)}, r and p are compatible in P_λ. But r, p ∈ A, so A is not
an antichain, a contradiction. □


6.6. THEOREM (Martin). There exists a CTM M with M ⊨ (MA ∧ 2^ω = ω_2).


PROOF. We have already hinted at the idea of the proof. We now give
details. Let M be a CTM with M ⊨ 2^ω = 2^{ω_1} = ω_2. Write λ for ω_1^M and κ for
ω_2^M. Recall the definition (given just after Proposition 6.1) of a normalized
PO set.

Inside M all the following is true: If P is a CCC PO set with card P ≤ κ,
then P ⊩ (λ*, κ* are cardinals ∧ 2^{λ*} = κ*). (This uses 3.4, 3.15.) Hence
P ⊩ (there exist only κ* normalized PO sets). Hence we may (by 2.16)
choose terms U_0(P), U_1(P) such that U_i(P) ⊆ P × κ × γ × γ, where γ = ω
or λ according as i = 0 or 1, and P ⊩ (U_i(P) is a function with domain κ*
and range the set of all order relations of normalized PO sets with
underlying set γ*). Let a, b, c be functions with domain κ − {0} such that
for all ξ < κ, a(ξ) < ξ, b(ξ) < κ, c(ξ) = 0 or 1, and for all α, β < κ and
i < 2, there exists a ξ < κ such that a(ξ) = α, b(ξ) = β, c(ξ) = i.

Still inside M: Using the functions a, b, c for bookkeeping, and also the
functions U_0, U_1, we try to define an increasing sequence P_ξ, ξ ≤ κ, of PO
sets, each of cardinality ≤ κ, and each with maximum element 1. In order
to carry out the construction, we construct at the same time retractions h^η_ξ
of P_η to P_ξ for ξ < η ≤ κ, such that h^ζ_ξ = h^η_ξ h^ζ_η when ξ < η < ζ ≤ κ. Let P_0
be any normalized CCC PO set with underlying set ω, and P_1 any such PO
set with underlying set λ. At a successor η + 1: Suppose everything
required is defined up to and including stage η. Consider α = a(η),
β = b(η), i = c(η), and γ = ω or λ according as i = 0 or 1. Let u =
U_i(P_α), and let v be a P_α-term such that P_α ⊩ v = u(β*). Thus P_α ⊩ (v is the
order relation of a normalized PO set with underlying set γ*). Note that v
is also a P_η-term and P_η forces the same thing about v (by 6.4(v)). Let R_η
be a P_η-term such that P_η ⊩ ((v is CCC ∧ R_η = v) ∨ (v is not CCC ∧ R_η is
the order relation of P_i*)). So in both cases P_η ⊩ (R_η is a normalized PO set
with underlying set γ*). (Such R_η exists by Theorem 2.14.) Let P_{η+1} =
P_η ⊗ R_η. P_{η+1} is thus CCC (by Lemma 6.2). We will verify below that
card P_{η+1} ≤ κ. Let h be the natural retraction of P_{η+1} to P_η. Let h^{η+1}_ξ = h for
ξ = η, and h^η_ξ h for ξ < η. At a limit ζ ≤ κ: Let P_ζ be the union of the
P_ξ, ξ < ζ, assuming that these are defined. For ξ < ζ let h^ζ_ξ be the union of
the maps h^η_ξ for ξ < η < ζ. It is easily checked that these maps are
retractions. The existence of such retractions implies this P_ζ is CCC (by
Lemma 6.5), and of course card P_ζ ≤ κ.

Still inside M, we carry out the verification that P_{η+1} as defined above has
cardinality ≤ κ. The argument resembles the proof of Theorem 3.15. Since
card P_η ≤ κ, it suffices to show for fixed q ∈ P_η that there are only κ
equivalence classes [q, ṙ] in P_η ⊗ R_η. Fix such [q, ṙ]. For each σ < γ, we can
find an antichain A(σ, ṙ) below q in P_η, maximal with respect to the
property that every q' ∈ A(σ, ṙ) either forces σ* ∈ ṙ or forces σ* ∉ ṙ. The
usual argument shows that the A(σ, ṙ) are actually maximal antichains
below q in P_η. Let B(σ, ṙ) = {q' ∈ A(σ, ṙ): q' ⊩ σ* ∈ ṙ}, C(σ, ṙ) =
A(σ, ṙ) − B(σ, ṙ). Now every antichain is countable, so as we vary ṙ, we
only get κ^{ω_1} = κ different pairs of sequences B(σ, ṙ), C(σ, ṙ), σ < γ. So it
suffices to show that if B(σ, ṙ) = B(σ, ṙ') and C(σ, ṙ) = C(σ, ṙ') for all
σ < γ, then [q, ṙ] = [q, ṙ'], i.e. q ⊩ ṙ = ṙ'. Well, suppose q does not force
ṙ = ṙ'. Since q ⊩ ṙ, ṙ' ∈ γ*, there must exist q' ≤ q in P_η and σ < γ such
that q' ⊩ (σ* ∈ ṙ ∧ σ* ∉ ṙ') or vice versa, say the former. By maximality
there is q'' ∈ B(σ, ṙ) ∪ C(σ, ṙ) compatible with q'. Clearly q'' ∈ B(σ, ṙ).
But q'' ∉ B(σ, ṙ'), equally clearly. So B(σ, ṙ) ≠ B(σ, ṙ'). Thus [q, ṙ] is
completely determined by the sequences B(σ, ṙ), C(σ, ṙ), σ < γ, completing
our verification.

Now all the above was true inside M, so we have constructed a
P = P_κ ∈ M such that M ⊨ P is a CCC PO set. Let G be an M-generic
subset of P, N = M[G]. By CCC, M and N have the same cardinals, and
N ⊨ 2^ω = 2^λ = κ. To show N ⊨ MA, it suffices to show that for any
(Q, F) ∈ N such that N ⊨ (Q is a normalized CCC PO set ∧ F is a family of
≤ λ dense subsets of Q), there is an F-generic subset K of Q
belonging to N. (Genericity is absolute by 2.12(iv).) For definiteness, let us
take the case that the underlying set of Q is λ rather than ω.

Now for ξ < κ, G_ξ = G ∩ |P_ξ| is an M-generic subset of P_ξ (by 6.4(v)).
Let M_ξ = M[G_ξ]. Thus M ⊆ M_ξ ⊆ N, and M and N and M_ξ have the same
cardinals. We begin by showing (Q, F) ∈ M_α for some α < κ. The proof
resembles a step in the proof of Theorem 5.7. Fix t ∈ 𝒫^M(P × λ × λ) with
I_G(t) being the order relation on Q.

All the following is true inside N: If σ, τ < λ, and σ ≤_Q τ, then there is a
p ∈ G with (p, σ, τ) ∈ t. Let f(σ, τ) be one such p. If on the other hand we
do not have σ ≤_Q τ, then there is p ∈ G such that for no q ≤_P p is
(q, σ, τ) ∈ t (viz., a p forcing (σ*, τ*) ∉ t). Let f(σ, τ) be one such p. Since
κ > λ is regular, all f(σ, τ) belong to G_α for some α < κ.

All this was true inside N, showing that the order relation of Q equals
I_{G_α}(t) ∈ M_α for some α < κ, hence Q ∈ M_α. To handle F, we fix a
surjection f ∈ N from λ to F, and apply a similar argument to E =
{(σ, τ) ∈ λ × λ: σ ∈ f(τ)}, to show that E and hence F belongs to M_α for
some α < κ. Finally, fix a single α < κ such that (Q, F) ∈ M_α.

Now $Q$ is, in $M_\alpha$, a normalized PO set with underlying set $\lambda$. If
$u = U_\lambda(P_\alpha)$ is the term such that $I_{G_\alpha}(u)$ is a surjection from $\kappa$ onto the set
of all order relations of such PO sets, then the order relation of $Q$ is
$(I_{G_\alpha}(u))(\beta)$ for some $\beta < \kappa$. Let $v$ be the $P_\alpha$-term with $P_\alpha \Vdash v = u(\beta^*)$, so
$I_{G_\alpha}(v)$ is the order relation of $Q$. Take $\eta < \kappa$ with $a(\eta) = \alpha$, $b(\eta) = \beta$,
$c(\eta) = 1$. Since there are no uncountable antichains in $Q$ inside $N$, a
fortiori there are none inside $M_\eta$. Now in the construction of $P_{\eta+1}$ given
above, $R_\eta$ was a term so chosen that for any $M$-generic subset $H$ of $P_\eta$ we
would have $I_H(R_\eta) = I_H(v)$ if the latter is CCC inside $M[H]$, and is
something else otherwise. So in our particular case, $I_{G_\eta}(R_\eta) = I_{G_\eta}(v) =
I_{G_\alpha}(v)$ is the order relation on $Q$.

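Finally, consider the $M$-generic subset $G_{\eta+1}$ of $P_{\eta+1} = P_\eta \otimes R_\eta$. By
Theorem 5.4,

\[ K = \{ I_{G_\eta}(\tau) : \exists q \in G_\eta \, ((q, \tau) \in G_{\eta+1}) \} \in M_{\eta+1} \subseteq N \]

is an $M_\eta$-generic subset of the PO set with order relation $I_{G_\eta}(R_\eta)$, i.e.
of $Q$. Since $F \subseteq M_\alpha \subseteq M_\eta$, $K$ is a fortiori $F$-generic, and our proof is
complete. □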


Simple variants of the above argument give models with MA and 
2° = ws, etc. 

Solovay [1966] initiated the use of forcing to prove ordinary mathematical
theorems (rather than consistency results). Unfortunately we lack space
to enter into a discussion of his methods here.

1. Introduction


The notion of constructibility in set theory was first introduced in Gödel
[1938], and was used to establish the non-refutability of the axiom of choice
in ZF (Zermelo-Fraenkel set theory without the axiom of choice) and of
the generalised continuum hypothesis (GCH) in ZFC (Zermelo-Fraenkel set
theory with the axiom of choice). The motivation for the
definition of "constructibility" is roughly as follows. We are working in ZF.
The universe, $V$, of all sets is "obtainable" by commencing with the null
set, $\emptyset$, and iterating the power set operation. In this way we obtain the
familiar cumulative hierarchy:


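\[ V_0 = \emptyset; \qquad V_\alpha = \bigcup \{ \mathcal{P}(V_\beta) \mid \beta < \alpha \}. \]


And we have
\[ V = \bigcup_{\alpha \in \mathrm{On}} V_\alpha, \]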


where On is the class of all ordinals. (We use here the usual notations and 
conventions of set theory, so, in particular, an ordinal number is equal to 
the set of all smaller ordinal numbers, and a cardinal number is a special 
kind of ordinal number.) 
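The first few stages of the cumulative hierarchy are finite and can be computed directly. The following is only an illustrative sketch: the names `powerset` and `V` are chosen here, hereditarily finite sets are modelled by Python `frozenset`s, and each stage is computed as the union of the power sets of all earlier stages.

```python
from itertools import chain, combinations

def powerset(s):
    """Return the set of all subsets of the finite set s, each as a frozenset."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return frozenset(frozenset(c) for c in subsets)

def V(n):
    """Finite stage V_n, computed as the union of P(V_m) for all m < n."""
    stage = frozenset()
    for m in range(n):
        stage = stage | powerset(V(m))
    return stage

sizes = [len(V(n)) for n in range(5)]
print(sizes)  # [0, 1, 2, 4, 16]
```

Each finite stage is transitive (every element is also a subset), and the sizes grow as iterated exponentials, which is why the sketch stops at a small stage.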

The reason why certain questions about the cumulative hierarchy are
not answerable in ZF or ZFC, so the argument runs, is that the notion of
the power set of an infinite set is too vague; we know that $\mathcal{P}(x)$, the power
set of $x$, consists of all subsets of $x$, but what does "all" mean here? The
axioms of ZF and ZFC do not help us much. The constructible universe is
obtained when this looseness is removed by taking the power set of any set
as small as possible, without contradicting the ZF axioms. More precisely,
we notice that any subset of a given set which is first-order definable (in the
language of set theory, $\mathcal{L}$) from other given sets must "exist" (in any
"universe") if the given sets "exist", and define the constructible hierarchy
(with the constructible universe as its limit) by taking, at stage $\alpha$, not all (?)
subsets of what we have so far, but only those subsets which are first-order
definable from what we have so far. This minimality of the constructible
universe has the result that for any cardinal $\kappa$, $2^\kappa$ is as small as possible
(hence the GCH holds in the constructible universe). And because every
constructible set can be, in a certain sense, uniquely "named" in terms of
previously introduced sets, the axiom of choice also holds in the constructible
universe.

2. The constructible universe


Let $X$ be any set. By $\mathrm{Def}(X)$ we mean the set of all subsets $a$ of $X$ which
are first-order definable over the structure $\langle X, \in, (x)_{x \in X}\rangle$. (See Section 3 of
Chapter A.1.) More precisely, for any set $X$, $\mathcal{L}_X$ denotes the language
obtained from $\mathcal{L}$ by introducing a new individual constant $\mathring{x}$, to denote $x$ in
the structure $\langle X, \in\rangle$, for each $x \in X$, and $\models_X$ denotes the satisfaction
relation for $\mathcal{L}_X$-sentences in the structure $\langle X, \in\rangle$ (with the standard
interpretation of the constants), whence $\mathrm{Def}(X)$ is the set of all $a \subseteq X$ such
that for some formula $\varphi(v_0)$ of $\mathcal{L}_X$, with free variable $v_0$ only,
$a = \{x \in X \mid \models_X \varphi(\mathring{x})\}$.

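By recursion on the ordinals, we define the constructible hierarchy as
follows:

\[ L_0 = \emptyset; \]

\[ L_{\alpha+1} = \mathrm{Def}(L_\alpha); \]

\[ L_\lambda = \bigcup_{\alpha < \lambda} L_\alpha \quad \text{if } \lim(\lambda) \]

(i.e. if $\lambda$ is a limit ordinal). (To see that this really is parallel to the
definition of the cumulative hierarchy, notice that $V_{\alpha+1} = \mathcal{P}(V_\alpha)$ for all $\alpha$
and that $V_\lambda = \bigcup_{\alpha<\lambda} V_\alpha$ if $\lim(\lambda)$.)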
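To see where the two hierarchies first part company, it may help to compare their early stages. The following worked remark is a sketch using only standard facts; the symbols $x_1, \ldots, x_k$ and $x_{i_1}, \ldots, x_{i_m}$ are introduced here for illustration, and $\lor$ is the usual abbreviation in terms of $\neg$ and $\wedge$.

```latex
% For a finite set X = {x_1, ..., x_k}, every subset a = {x_{i_1}, ..., x_{i_m}}
% is definable with parameters, so Def(X) = P(X):
\[
  a = \{\, x \in X \mid \models_X
        (\mathring{x} = \mathring{x}_{i_1} \lor \cdots \lor \mathring{x} = \mathring{x}_{i_m}) \,\}.
\]
% Hence, by induction on n,
\[
  L_n = V_n \ (n < \omega), \qquad L_\omega = V_\omega .
\]
% The hierarchies separate at the next step: the language \mathcal{L}_{L_\omega}
% has only countably many formulas, so
\[
  |L_{\omega+1}| = \aleph_0 \quad\text{whereas}\quad |V_{\omega+1}| = 2^{\aleph_0}.
\]
```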

The constructible universe is the class $L = \bigcup_{\alpha \in \mathrm{On}} L_\alpha$. A set $x$ is
constructible if $x \in L$ (i.e. if there is an $\alpha$ such that $x \in L_\alpha$).

The following lemma is almost immediate from the definitions. 


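2.1. LEMMA. (i) $\alpha < \beta$ implies $L_\alpha \cup \{L_\alpha\} \subseteq L_\beta$;
(ii) Each $L_\alpha$ is transitive. $L$ is transitive.
(iii) For all $\alpha$, $L \cap \alpha = L_\alpha \cap \alpha = \alpha$ and $\alpha \in L_{\alpha+1}$. $\mathrm{On} \subseteq L$.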


As hinted at in our introductory discussion, $L$ is a suitable "universe" for
set theory; i.e. $L$ is an inner model of ZF. (An inner model of a theory $T$ is
a transitive class $M$ such that $\mathrm{On} \subseteq M$ and for each axiom $\varphi$ of $T$, $\varphi^M$, the
relativisation of $\varphi$ to $M$, is valid.)


2.2. THEOREM (Gödel). For each axiom $\varphi$ of ZF, $\mathrm{ZF} \vdash \varphi^L$.


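PROOF. (In ZF) Extensionality. We must show that
$\forall x \forall y [\forall z (z \in x \leftrightarrow z \in y) \to x = y]$ holds in $L$; i.e. that for any $x, y$ in $L$,
$x = y \leftrightarrow \forall z \in L (z \in x \leftrightarrow z \in y)$. But $L$ is transitive, so this is valid by the
axiom of extensionality in $V$.

Pairing. Given $x, y$ in $L$ we must show that there is a set $z$ in $L$ whose
only members are $x$ and $y$. But if $\alpha$ is such that $x, y \in L_\alpha$, then
$\{x, y\} = \{z \in L_\alpha \mid \models_{L_\alpha} z = \mathring{x} \lor z = \mathring{y}\} \in \mathrm{Def}(L_\alpha) = L_{\alpha+1} \subseteq L$, so we are done.

Union. Let $x \in L$. We show that $\bigcup x \in L$. Pick $\alpha$ with $x \in L_\alpha$. Since $L_\alpha$
is transitive, $y = \bigcup x \subseteq L_\alpha$. But the formula $\varphi(v_0) = \exists v_1 (v_0 \in v_1 \wedge v_1 \in \mathring{x})$
defines $y$ as a subset of $L_\alpha$ over $L_\alpha$, so $y \in \mathrm{Def}(L_\alpha) = L_{\alpha+1} \subseteq L$, as required.

Infinity. By 2.1(iii), $\omega \in L_{\omega+1} \subseteq L$, so this is immediate.

Power Set. Let $x \in L$. Set $y = \{z \in \mathcal{P}(x) \mid z \in L\}$, a bona fide set by the
axioms of power set and comprehension in $V$. For each $z \in y$, let $f(z)$ be
the least $\alpha$ such that $z \in L_\alpha$. By the axiom of replacement in $V$, we can find
an $\alpha$ which exceeds all $f(z)$ for $z \in y$ (and such that $x \in L_\alpha$). Thus $y \subseteq L_\alpha$.
The formula $\varphi(v_0) = \forall v_1 (v_1 \in v_0 \to v_1 \in \mathring{x})$ defines $y$ as a subset of $L_\alpha$
over $L_\alpha$, so $y \in L_{\alpha+1} \subseteq L$. And clearly, $[y = \mathcal{P}(x)]^L$.

Foundation. Since the membership relation of $L$ is just the restriction to
$L$ of the usual membership relation, this follows immediately from the
axiom of foundation in $V$.

Comprehension. Let $\varphi(v_0, \ldots, v_n)$ be a given $\mathcal{L}$-formula, $x, a_1, \ldots, a_n$
given elements of $L$. We must show that there is $y \in L$ such that
$[y = \{z \in x \mid \varphi(z, \vec{a})\}]^L$. Pick $\alpha$ with $x, \vec{a} \in L_\alpha$. By repeating for the
constructible hierarchy the well-known proof of the Reflection Principle,
we can find $\beta > \alpha$ such that $\forall z \in L_\beta [\varphi^L(z, \vec{a}) \leftrightarrow \varphi^{L_\beta}(z, \vec{a})]$. Let
$y = \{z \in x \mid \varphi^L(z, \vec{a})\}$. Since $x, \vec{a} \in L_\beta$ and $y \subseteq x \subseteq L_\beta$, the choice of $\beta$ ensures
that the formula $\psi(v_0) = v_0 \in \mathring{x} \wedge \varphi(v_0, \mathring{\vec{a}})$ defines $y$ over $L_\beta$, so $y \in L_{\beta+1}$,
and we are done.

Replacement. Suppose $\varphi$ is an $\mathcal{L}$-formula and $\vec{a} \in L$ and
$[\forall x \exists y\, \varphi(y, x, \vec{a})]^L$. Let $u \in L$ be given. We seek a $v \in L$ such that
$[\forall x \in u\, \exists y \in v\, \varphi(y, x, \vec{a})]^L$. Pick $\alpha$ with $u, \vec{a} \in L_\alpha$. For each $x \in u$, let $f(x)$
be the least $\beta \ge \alpha$ such that $\varphi^L(y, x, \vec{a})$ for some $y \in L_\beta$. Let $\gamma$ exceed all
$f(x)$, $x \in u$. Then $v = L_\gamma$ is as required.

The proof is complete. □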


In order to prove that AC and GCH hold in $L$, we must first examine the
definition of the constructible hierarchy in some detail. To do this we must
first give a set-theoretical definition of the language $\mathcal{L}_V$, and investigate the
basic semantic notions of this language.¹


¹ To be precise, in order for the discussion in Section 2 to make sense we needed to know
that the language $\mathcal{L}$ and its syntax and semantics were formalisable within set theory itself,
but since we only needed to know that this is possible, not how it may be done, we did not
emphasize this at the time.

3. Arithmetisation of $\mathcal{L}_V$


The following will amount to a formal definition of the language $\mathcal{L}_V$
within set theory. If $s$ and $t$ are finite sequences, $s^\frown t$ denotes the
concatenation of $s$ and $t$ (i.e. the finite sequence which starts with $s$ and
ends with $t$).

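Variables:

\[ v_n = (2, n), \quad n \in \omega. \]

That is, $v_n$ is the ordered pair $(2, n)$. Let Vbl be the predicate "... is a
variable".

Constants:

\[ \mathring{x} = (3, x), \quad x \in V. \]

Let Const be the predicate "... is a constant".

Primitive formulas:

\[ (x \in y) = \langle 0 \rangle^\frown \langle 4 \rangle^\frown \langle x \rangle^\frown \langle y \rangle^\frown \langle 1 \rangle, \quad \text{where } x, y \in \mathrm{Vbl} \cup \mathrm{Const}. \]
\[ (x = y) = \langle 0 \rangle^\frown \langle 5 \rangle^\frown \langle x \rangle^\frown \langle y \rangle^\frown \langle 1 \rangle, \quad \text{where } x, y \in \mathrm{Vbl} \cup \mathrm{Const}. \]

Let PFml be the predicate "... is a primitive formula".

Formulas:

\[ (\varphi \wedge \psi) = \langle 0 \rangle^\frown \langle 6 \rangle^\frown \varphi^\frown \psi^\frown \langle 1 \rangle \]
\[ (\neg \varphi) = \langle 0 \rangle^\frown \langle 7 \rangle^\frown \varphi^\frown \langle 1 \rangle \]
\[ (\exists u \varphi) = \langle 0 \rangle^\frown \langle 8 \rangle^\frown \langle u \rangle^\frown \varphi^\frown \langle 1 \rangle, \quad \text{where } \mathrm{Vbl}(u). \]

(We are here simply giving schemas for the generation of the formulas
from the primitive formulas, of course.)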
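The coding above can be tried out concretely. The sketch below is an illustration under stated assumptions rather than part of the formal development: Python tuples stand in both for ordered pairs and for finite sequences, the interpretation of a constant may be any Python value, and all function names (`var`, `const`, `mem`, `eq`, `conj`, `neg`, `exists`, `is_formula`) are chosen here.

```python
def var(n):
    return (2, n)                      # v_n is the ordered pair (2, n)

def const(x):
    return (3, x)                      # the constant naming x is (3, x)

def mem(x, y):
    return (0, 4, x, y, 1)             # (x in y) as a sequence of length 5

def eq(x, y):
    return (0, 5, x, y, 1)             # (x = y) as a sequence of length 5

def conj(phi, psi):
    return (0, 6) + phi + psi + (1,)   # <0>^<6>^phi^psi^<1>

def neg(phi):
    return (0, 7) + phi + (1,)         # <0>^<7>^phi^<1>

def exists(u, phi):
    return (0, 8, u) + phi + (1,)      # <0>^<8>^<u>^phi^<1>

def is_var(t):
    return (isinstance(t, tuple) and len(t) == 2 and t[0] == 2
            and isinstance(t[1], int) and t[1] >= 0)

def is_const(t):
    return isinstance(t, tuple) and len(t) == 2 and t[0] == 3

def is_term(t):
    return is_var(t) or is_const(t)

def parse(s, i=0):
    """Return the index just past a formula starting at s[i], or None."""
    if i + 1 >= len(s) or s[i] != 0:
        return None
    tag = s[i + 1]
    if tag in (4, 5):                  # primitive formula
        ok = (i + 4 < len(s) and is_term(s[i + 2])
              and is_term(s[i + 3]) and s[i + 4] == 1)
        return i + 5 if ok else None
    if tag == 6:                       # conjunction: two subformulas
        j = parse(s, i + 2)
        k = parse(s, j) if j is not None else None
    elif tag == 7:                     # negation: one subformula
        k = parse(s, i + 2)
    elif tag == 8:                     # quantifier: a variable, then a subformula
        k = parse(s, i + 3) if i + 2 < len(s) and is_var(s[i + 2]) else None
    else:
        return None
    if k is not None and k < len(s) and s[k] == 1:
        return k + 1
    return None

def is_formula(s):
    """A recogniser playing the role of the predicate Fml."""
    return isinstance(s, tuple) and parse(s) == len(s)

# The formula "exists v1 (v0 in v1 and v1 in x)" used for the union axiom.
phi = exists(var(1), conj(mem(var(0), var(1)), mem(var(1), const("x"))))
print(len(phi), is_formula(phi))
```

The recogniser reads a sequence from left to right, which works because every formula begins with 0 and a tag and ends with 1, so the coding is uniquely readable.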

Let Fml be the predicate "... is a formula". If $u$ is a given set, we define
$\mathrm{Const}_u(x)$ to be $\mathrm{Const}(x) \wedge (x)_1 \in u$ (i.e. $x$ is a constant whose standard
interpretation lies in $u$), and obtain the relativised predicates $\mathrm{PFml}_u(x)$ and
$\mathrm{Fml}_u(x)$ in the obvious way. (Thus, if $\mathrm{Fml}_u(x)$, then $x$ is a formula of $\mathcal{L}_u$.)
We regard each of these as predicates of both $u$ and $x$.

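The predicates introduced above have the following set-theoretical
definitions.

\[ \mathrm{Vbl}(x) \leftrightarrow [x \text{ is an ordered pair}] \wedge [(x)_0 = 2] \wedge [(x)_1 \text{ is a natural number}]; \]

\[ \mathrm{Const}(x) \leftrightarrow [x \text{ is an ordered pair}] \wedge [(x)_0 = 3]; \]

\[ \mathrm{PFml}(x) \leftrightarrow [x \text{ is a sequence}] \wedge [\mathrm{dom}(x) = 5] \wedge [x(0) = 0] \wedge [x(1) = 4 \lor x(1) = 5] \wedge [\mathrm{Vbl}(x(2)) \lor \mathrm{Const}(x(2))] \wedge [\mathrm{Vbl}(x(3)) \lor \mathrm{Const}(x(3))] \wedge [x(4) = 1]; \]

\[ \mathrm{Fml}(x) \leftrightarrow \exists f\, (\mathrm{Build}(x, f)), \]

where Build is the predicate:

\[ \mathrm{Build}(\varphi, \psi) \leftrightarrow [\psi \text{ is a finite sequence}] \wedge \psi_{\mathrm{dom}(\psi)-1} = \varphi \wedge \forall i \in \mathrm{dom}(\psi)\, \big(\mathrm{PFml}(\psi_i) \lor \exists j, k \in i\, (\psi_i = (\psi_j \wedge \psi_k)) \lor \exists j \in i\, (\psi_i = \neg\psi_j) \lor \exists j \in i\, \exists u \in \mathrm{ran}(\varphi)\, (\mathrm{Vbl}(u) \wedge \psi_i = \exists u\, \psi_j)\big). \]

The predicates $\mathrm{Const}_u(x)$, $\mathrm{PFml}_u(x)$, $\mathrm{Fml}_u(x)$ have similar definitions.
We may now give set-theoretical definitions of the basic syntactical and
semantical notions of $\mathcal{L}_V$.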


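Let $\mathrm{Fr}(\varphi)$ be the set of variables occurring free in $\varphi$, if $\mathrm{Fml}(\varphi)$, otherwise
let $\mathrm{Fr}(\varphi) = \emptyset$. The function Fr has the set-theoretical definition:

\[ y = \mathrm{Fr}(\varphi) \leftrightarrow [\neg\mathrm{Fml}(\varphi) \wedge y = \emptyset] \lor \exists f\, \exists \psi\, \big[\mathrm{Build}(\varphi, \psi) \wedge (f \text{ is a finite sequence}) \wedge \mathrm{dom}(f) = \mathrm{dom}(\psi) \wedge f(\mathrm{dom}(f) - 1) = y \wedge \forall i \in \mathrm{dom}(f)\, [(\mathrm{PFml}(\psi_i) \wedge f(i) = F(\psi_i)) \lor \exists j, k \in i\, (\psi_i = (\psi_j \wedge \psi_k) \text{ and } f(i) = f(j) \cup f(k)) \lor \exists j \in i\, (\psi_i = \neg\psi_j \text{ and } f(i) = f(j)) \lor \exists j \in i\, \exists u \in \mathrm{ran}(\varphi)\, (\mathrm{Vbl}(u) \text{ and } \psi_i = \exists u\, \psi_j \text{ and } f(i) = f(j) - \{u\})]\big], \]

where $F = \mathrm{Fr} \upharpoonright \mathrm{PFml}$. (The function $F$ clearly has a very simple definition,
which we omit for brevity.)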
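If $\varphi$ is a formula, $v$ is a variable, and $t$ is a constant, $\varphi(v/t)$ denotes the
formula obtained from $\varphi$ by substituting $t$ for every free occurrence of $v$ in
$\varphi$. We define the function Sub by:

\[ \mathrm{Sub}(\varphi, v, t) = \begin{cases} \varphi(v/t), & \text{if } \mathrm{Fml}(\varphi) \wedge \mathrm{Vbl}(v) \wedge \mathrm{Const}(t), \\ \emptyset, & \text{otherwise.} \end{cases} \]

The function Sub has the set-theoretical definition:

\[ \varphi' = \mathrm{Sub}(\varphi, v, t) \leftrightarrow [\neg(\mathrm{Fml}(\varphi) \wedge \mathrm{Vbl}(v) \wedge \mathrm{Const}(t)) \wedge \varphi' = \emptyset] \lor \big[\mathrm{Fml}(\varphi) \wedge \mathrm{Vbl}(v) \wedge \mathrm{Const}(t) \wedge \exists \theta\, \exists \psi\, [\mathrm{Build}(\varphi, \psi) \wedge \theta \text{ is a finite sequence} \wedge \mathrm{dom}(\theta) = \mathrm{dom}(\psi) \wedge \theta_{\mathrm{dom}(\theta)-1} = \varphi' \wedge \forall i \in \mathrm{dom}(\psi)\, [(\mathrm{PFml}(\psi_i) \wedge \theta_i = S(\psi_i, v, t)) \lor \exists j, k \in i\, (\psi_i = (\psi_j \wedge \psi_k) \text{ and } \theta_i = (\theta_j \wedge \theta_k)) \lor \exists j \in i\, (\psi_i = \neg\psi_j \text{ and } \theta_i = \neg\theta_j) \lor \exists j \in i\, \exists u \in \mathrm{ran}(\varphi)\, (\mathrm{Vbl}(u) \text{ and } u \ne v \text{ and } \psi_i = \exists u\, \psi_j \text{ and } \theta_i = \exists u\, \theta_j) \lor \exists j \in i\, (\psi_i = \exists v\, \psi_j \text{ and } \theta_i = \psi_i)]]\big], \]

where $S = \mathrm{Sub} \upharpoonright (\mathrm{PFml} \times V \times V)$.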


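Let $\mathrm{Sat}(u, \varphi)$ be the predicate "$\varphi$ is a sentence of $\mathcal{L}_u$ such that $\models_u \varphi$".
This has the following set-theoretical definition:

\[ \mathrm{Sat}(u, \varphi) \leftrightarrow u \ne \emptyset \wedge \mathrm{Fml}_u(\varphi) \wedge \exists f\, \exists g\, \big[(f \text{ is a finite sequence}) \wedge (\varphi \in f(\mathrm{dom}(f) - 1)) \wedge (g \text{ is a finite sequence}) \wedge (\mathrm{dom}(g) = \mathrm{dom}(f)) \wedge G_0 \wedge G_1 \wedge F_0 \wedge F_1\big], \]

where the four clauses are as follows. The sequence $g$ enumerates the formulas of $\mathcal{L}_u$ by stages:

\[ G_0: \quad g(0) = \{v_i = v_j \mid i, j \in \omega\} \cup \{v_i \in v_j \mid i, j \in \omega\} \cup \{v_i = \mathring{x} \mid i \in \omega \wedge x \in u\} \cup \{\mathring{x} = v_i \mid i \in \omega \wedge x \in u\} \cup \{v_i \in \mathring{x} \mid i \in \omega \wedge x \in u\} \cup \{\mathring{x} \in v_i \mid i \in \omega \wedge x \in u\} \cup \{\mathring{x} \in \mathring{y} \mid x, y \in u\} \cup \{\mathring{x} = \mathring{y} \mid x, y \in u\}; \]

\[ G_1: \quad \forall i \in (\mathrm{dom}(g) - 1)\, \big[g(i+1) = g(i) \cup \{(\varphi' \wedge \psi') \mid \varphi', \psi' \in g(i)\} \cup \{\neg\psi \mid \psi \in g(i)\} \cup \{\exists v_n \psi \mid n \in \omega \wedge \psi \in g(i)\}\big]; \]

and the sequence $f$ picks out, stage by stage, those which are true:

\[ F_0: \quad f(0) = \{\mathring{x} \in \mathring{y} \mid x, y \in u \wedge x \in y\} \cup \{\mathring{x} = \mathring{x} \mid x \in u\}; \]

\[ F_1: \quad \forall i \in (\mathrm{dom}(f) - 1)\, \big[f(i+1) = f(i) \cup \{(\varphi' \wedge \psi') \mid \varphi', \psi' \in f(i)\} \cup \{\neg\psi \mid \psi \in g(i) - f(i)\} \cup \{\exists v_n \psi \mid n \in \omega \wedge \psi \in g(i) \wedge \exists x \in u\, (\mathrm{Sub}(\psi, v_n, \mathring{x}) \in f(i))\}\big]. \]


The function Def thus has the set-theoretical definition:

\[ v = \mathrm{Def}(u) \leftrightarrow \forall x \in v\, \exists \varphi\, [\mathrm{Fml}_u(\varphi) \wedge \mathrm{Fr}(\varphi) = \{v_0\} \wedge x = \{z \in u \mid \mathrm{Sat}(u, \mathrm{Sub}(\varphi, v_0, \mathring{z}))\}] \wedge \forall \varphi\, [\mathrm{Fml}_u(\varphi) \wedge \mathrm{Fr}(\varphi) = \{v_0\} \to \exists x \in v\, (x = \{z \in u \mid \mathrm{Sat}(u, \mathrm{Sub}(\varphi, v_0, \mathring{z}))\})]. \]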


4. The structure of the constructible hierarchy 


Most of the fundamental lemmas concerning the constructible hierarchy 
are consequences of one result, to be proved in this section. First, we must 
remind the reader of some standard notions from logic. 

A formula $\varphi$ of $\mathcal{L}_V$ is said to be $\Sigma_0$ if it contains no unbounded
quantifiers. (See 1.2 in Chapter B.4, except that there they are called $\Delta_0$
formulas.) It is $\Sigma_n$ (resp. $\Pi_n$) (for $n \ge 1$) if it is of the form
$\exists x_1 \forall x_2 \exists x_3 \cdots Q x_n\, \psi$ (resp. $\forall x_1 \exists x_2 \forall x_3 \cdots Q x_n\, \psi$), where the $n$ quantifiers
alternate, $Q$ is the last of them, and $\psi$ is $\Sigma_0$.

A class $U$ is $\Sigma_n^{\mathrm{ZF}}$ if it is defined (over ZF) by a $\Sigma_n$ formula of $\mathcal{L}$. Similarly
$\Pi_n^{\mathrm{ZF}}$. A class is $\Delta_n^{\mathrm{ZF}}$ if it is both $\Sigma_n^{\mathrm{ZF}}$ and $\Pi_n^{\mathrm{ZF}}$.

Let $M$ be a given class, $N \subseteq M$. A predicate (finitary) $P$ on $M$ is $\Sigma_n^M(N)$ if
it is defined (over $M$) by a $\Sigma_n$ formula of $\mathcal{L}_N$. Similarly $\Pi_n^M(N)$ and $\Delta_n^M(N)$.
We write $\Sigma_n^M$ in place of $\Sigma_n^M(\emptyset)$ and $\Sigma_n(M)$ in place of $\Sigma_n^M(M)$, etc. (Thus, in
a certain sense, $\Sigma_n^{\mathrm{ZF}}$ corresponds to "$\Sigma_n^V$".)

These notions have a close connection with the notion of absoluteness,
an important concept in constructibility theory.

A predicate $P$ is said to be absolute for a class $M$ if for all $\vec{x} \in M$,
$P(\vec{x}) \leftrightarrow P^M(\vec{x})$ (i.e. if $P$ is the same whether interpreted in $M$ or in $V$).

If $M$ is transitive, any $\Sigma_0$ formula of $\mathcal{L}_M$ will be absolute for $M$, as is
easily seen. (Proof by induction on formulas.) It follows immediately that if
$P$ is a $\Delta_1^{\mathrm{ZF}}$ predicate and if $M$ is a transitive model of ZF, or at least of a
fragment of ZF which is sufficiently strong to prove the equivalence of the
$\Sigma_1$ and $\Pi_1$ definitions of $P$, then $P$ is absolute for $M$.

Most of the basic concepts of set theory are $\Sigma_0^{\mathrm{ZF}}$, and in fact $\Sigma_0^M$ for any
transitive set $M$. In particular, the following are: $x = y$, $x \in y$, $x \subseteq y$, $y = \{x, z\}$,
$y = (x, z)$, $y = (x)_0$, $y = (x)_1$, $y = \bigcup x$, $y = \bigcap x$, $x$ is an ordered pair,
$y = x \times z$, $y = x - z$, $x$ is a function, $y = \mathrm{dom}(x)$, $y = \mathrm{ran}(x)$, $y = x(z)$, $\mathrm{On}(x)$,
$\lim(x)$, $\mathrm{succ}(x)$, $x$ is a natural number, $x$ is a sequence, $x$ is a finite
sequence, $x = \omega$, $\omega \in x$. In fact, each of these predicates is uniformly $\Sigma_0^M$
for transitive $M$, which is to say the $\Sigma_0$ formula involved in each case is
independent of the actual choice of $M$.

Notice also that if $P(x)$ is $\Sigma_0^{\mathrm{ZF}}$ or else $\Sigma_0^M$, where $M$ is transitive, then
$P((x)_0)$ and $P((x)_1)$ are also $\Sigma_0$, whence so also are $\forall x \in \mathrm{dom}(u)\, (P(x))$ and
$\forall x \in \mathrm{ran}(u)\, (P(x))$ (for $u$ a finite sequence, say).
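For a few of the basic concepts listed above, the $\Sigma_0$ definitions can be written out explicitly. The formulas below are standard; they are given only as illustrations of how every quantifier can be bounded by one of the sets concerned.

```latex
\begin{align*}
  x \subseteq y \;&\leftrightarrow\; \forall u \in x \,(u \in y), \\
  y = \{x, z\} \;&\leftrightarrow\; x \in y \wedge z \in y \wedge \forall u \in y \,(u = x \lor u = z), \\
  y = \textstyle\bigcup x \;&\leftrightarrow\;
      \forall u \in y \,\exists v \in x \,(u \in v) \;\wedge\; \forall v \in x \,\forall u \in v \,(u \in y).
\end{align*}
```

Since each bounded quantifier ranges over the elements of a set lying in $M$, and a transitive $M$ contains all those elements, evaluating such a formula in $M$ or in $V$ gives the same answer.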

One final comment which we should perhaps make here is that if $M$
is transitive and closed under the formation of ordered pairs, then any
$\Sigma_n^M$ predicate $Q$ on $M$ is expressible in the form
$\exists x_1 \forall x_2 \exists x_3 \cdots Q' x_n\, R(x_1, \ldots, x_n)$, where the $n$ quantifiers alternate, $Q'$ is
the last of them, and $R$ is $\Sigma_0^M$. (Contraction of quantifiers.)

A function is called $\Sigma_n$ just in case its graph is $\Sigma_n$. We now have the
necessary prerequisites for:


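4.1. LEMMA. The function $(L_\nu \mid \nu \in \mathrm{On})$ is $\Sigma_1^{\mathrm{ZF}}$ and uniformly $\Sigma_1^{L_\lambda}$ for limit
$\lambda > \omega$.

PROOF. We clearly have:

\[ y = L_\alpha \leftrightarrow \exists f\, \big[f \text{ is a function} \wedge \mathrm{dom}(f) = \alpha + 1 \wedge f(0) = \emptyset \wedge \forall \gamma \in \mathrm{dom}(f)\, \big[(\lim(\gamma) \wedge \gamma > 0 \to f(\gamma) = \textstyle\bigcup_{\delta < \gamma} f(\delta)) \wedge (\mathrm{succ}(\gamma) \to f(\gamma) = \mathrm{Def}(f(\gamma - 1)))\big] \wedge y = f(\alpha)\big]. \]

We must show that this is $\Sigma_1$ by a definition which works both for ZF and
arbitrary limit $L_\lambda$ above $\omega$. Well, by checking through the definitions given
in Section 3, it is easily seen that the predicates $\mathrm{Vbl}(x)$, $\mathrm{Const}_u(x)$,
$\mathrm{PFml}_u(x)$ are $\Sigma_0$, and that in all the remaining cases, any unbounded
quantifiers may, in fact, be bound here by one of:

\[ \omega, \quad \mathcal{P}_{<\omega}(\mathrm{Seq}(9 \cup \mathrm{Vbl} \cup \mathrm{Const}_{\bigcup \mathrm{ran}(f)})), \quad \mathrm{Seq}(\mathcal{P}_{<\omega}(\mathrm{Seq}(9 \cup \mathrm{Vbl} \cup \mathrm{Const}_{\bigcup \mathrm{ran}(f)}))), \]
\[ \mathrm{Seq}(\mathrm{Seq}(9 \cup \mathrm{Vbl} \cup \mathrm{Const}_{\bigcup \mathrm{ran}(f)})), \quad \mathrm{Seq}(9 \cup \mathrm{Vbl} \cup \mathrm{Const}_{\bigcup \mathrm{ran}(f)}), \quad \mathrm{Seq}(\mathcal{P}_{<\omega}(\mathrm{Vbl})), \]

where $\mathrm{Seq}(x)$ = the set of all finite sequences from $x$, $\mathcal{P}_{<\omega}(x)$ = the set of
all finite subsets of $x$. Hence, if we define the function $w$ by

\[ w(f) = \omega \cup \mathcal{P}_{<\omega}(\mathrm{Seq}(9 \cup \mathrm{Vbl} \cup \mathrm{Const}_{\bigcup \mathrm{ran}(f)})) \cup \mathrm{Seq}(\mathcal{P}_{<\omega}(\mathrm{Seq}(9 \cup \mathrm{Vbl} \cup \mathrm{Const}_{\bigcup \mathrm{ran}(f)}))) \cup \mathrm{Seq}(\mathrm{Seq}(9 \cup \mathrm{Vbl} \cup \mathrm{Const}_{\bigcup \mathrm{ran}(f)})) \cup \mathrm{Seq}(9 \cup \mathrm{Vbl} \cup \mathrm{Const}_{\bigcup \mathrm{ran}(f)}) \cup \mathrm{Seq}(\mathcal{P}_{<\omega}(\mathrm{Vbl})), \]

we obtain a definition of the form

\[ y = L_\alpha \leftrightarrow \exists f\, \exists w\, [w = w(f) \wedge U(w, f, \alpha)], \]

where $U(w, f, \alpha)$ is the $\Sigma_0$ predicate obtained from the original definition of
the function by deleting the initial existential quantifier and binding all
remaining quantifiers which are not already bound, by $w$. To show that this
is $\Sigma_1$, we must show that the function $w(f)$ is $\Sigma_1$. This reduces to showing
that the functions $\mathrm{Seq}(x)$ and $\mathcal{P}_{<\omega}(x)$ are $\Sigma_1$. And since
$y = \mathcal{P}_{<\omega}(x) \leftrightarrow \exists z\, [z = \mathrm{Seq}(x) \wedge y = \{\mathrm{ran}(u) \mid u \in z\}]$, this in turn reduces to
showing that the function $\mathrm{Seq}(x)$ is $\Sigma_1$. Well,

\[ y = \mathrm{Seq}(x) \leftrightarrow \exists g\, \big[g \text{ is a sequence} \wedge \mathrm{dom}(g) = \omega \wedge y = \textstyle\bigcup \mathrm{ran}(g) \wedge g(0) = \{\emptyset\} \wedge \forall n \in \omega\, \forall s \in g(n+1)\, \exists t \in g(n)\, \exists a \in x\, (s = t \cup \{(a, n)\}) \wedge \forall n \in \omega\, \forall s \in g(n)\, \forall a \in x\, \exists t \in g(n+1)\, (t = s \cup \{(a, n)\})\big]. \]

This is clearly $\Sigma_1$. Moreover, this definition works for any limit $L_\lambda$ above $\omega$;
for if $x \in L_\alpha$, then $\mathrm{Seq}(x) \subseteq L_{\alpha+2}$, so by the definition of $L_{\alpha+3}$ from
$L_{\alpha+2}$, $\mathrm{Seq}(x) \in L_{\alpha+3}$, and the unique $g$ satisfying the above condition lies in
$L_{\alpha+5}$.

It remains only to show that our final definition of the hierarchy function
is suitable for any limit $L_\lambda$ above $\omega$. Well, there is clearly only one
possibility for the function $f$, namely $(L_\nu \mid \nu \le \alpha)$. And by an easy induction
on $\alpha$ we can show that $\alpha < \lambda$ implies $(L_\nu \mid \nu \le \alpha) \in L_\lambda$. The lemma is
proved. □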


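4.2. COROLLARY. The function $(L_\nu \mid \nu \in \mathrm{On})$ is $\Delta_1^{\mathrm{ZF}}$ and uniformly $\Delta_1^{L_\lambda}$ for
limit $\lambda > \omega$.


PROOF. In general, any $\Sigma_1$ function with a $\Pi_1$ domain is $\Delta_1$. To see this,
consider the equivalence: $y = f(x) \leftrightarrow x \in \mathrm{dom}(f) \wedge \forall z\, [z = f(x) \to
z = y]$. □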
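The quantifier count behind this equivalence can be spelled out. The following is a sketch under the assumption that the graph of $f$ and its domain are given by the formulas shown; the symbols $\theta$ and $\chi$ are introduced here for illustration only.

```latex
% Assume, with \theta and \chi both \Sigma_0:
%   y = f(x)      <->  \exists w \, \theta(w, x, y)     (the graph is \Sigma_1)
%   x \in dom(f)  <->  \forall u \, \chi(u, x)          (the domain is \Pi_1)
\[
  y = f(x) \;\leftrightarrow\;
  \forall u \,\chi(u, x) \;\wedge\; \forall z \,\forall w \,[\theta(w, x, z) \to z = y].
\]
% The hypothesis "z = f(x)" is \Sigma_1, so moving its existential quantifier
% out of the antecedent turns it into a universal one. The right-hand side is
% therefore \Pi_1 (after contraction of quantifiers). Together with the given
% \Sigma_1 definition, this shows that the graph of f is \Delta_1.
```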


5. The axiom of constructibility 


This is the assertion that all sets are constructible, and is usually written 
as V=L. By 4.2, we have at once: 


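5.1. LEMMA. (i) The class $L$ is $\Sigma_1^{\mathrm{ZF}}$.
(ii) $V = L$ is a $\Pi_2$ formula of $\mathcal{L}$.

PROOF. $x \in L \leftrightarrow \exists \alpha\, [x \in L_\alpha]$, so $L$ is $\Sigma_1^{\mathrm{ZF}}$. And $V = L \leftrightarrow \forall x\, \exists \alpha\,
(x \in L_\alpha)$. □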


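4.2 also gives us (by the absoluteness of $\Delta_1$ properties):


5.2. LEMMA. (i) If $M$ is a transitive model of ZF, then for all $\alpha \in M$, $L_\alpha \in M$
and $(L_\alpha)^M = L_\alpha$. Hence $(L)^M = L_\lambda$ if $\lambda = \mathrm{On} \cap M$ and $(L)^M = L$ if $\mathrm{On} \subseteq M$.
(ii) For all $\alpha$, $(L_\alpha)^L = L_\alpha$. Hence $(L)^L = L$.
(iii) If $\lim(\lambda)$ and $\lambda > \omega$, then $(L)^{L_\lambda} = L_\lambda$.


By part (ii) of the above, we have at once:


5.3. THEOREM (Gödel). $\mathrm{ZF} \vdash (V = L)^L$.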


It should be pointed out that the use of the word ‘‘axiom”’ here is not 
intended to convey the idea that V=L is a ‘“‘reasonable”’ or “‘intuitive”’ 
addition to the axioms of ZFC as a basis for set theory. It is just that V = L 
is an interesting additional assumption, worthy of study both in its own 
right and because it resolves many of the questions known to be undecid- 
able on the basis of ZFC alone. (See our introductory remarks.) Moreover, 
any result proved in the system ZFC + $V = L$ will automatically be shown
to be consistent relative to ZF. (Because the construction of the
constructible universe took place in ZF.)


6. The axiom of choice in L 


We show that the axiom of choice is valid in L in a very strong way, by 
giving a formula which defines, over L, a well-ordering of L. The idea 
behind our definition is to fix a specific ‘order of construction”’ for the 
elements of L. In order to do this, it is convenient to ‘‘normalise’”’ the 
definition of the constructible hierarchy slightly. 


6.1. LEMMA. For each $\alpha$, there is an $L_\alpha$-definable function $(\cdot,\cdot)^\alpha$ on $L_\alpha$ such
that:

(i) $L_\alpha$ is closed under $(\cdot,\cdot)^\alpha$.

(ii) For all $x, y, x', y' \in L_\alpha$, $(x, y)^\alpha = (x', y')^\alpha \leftrightarrow x = x' \wedge y = y'$.


PROOF. If $\lim(\alpha)$, then the usual pairing function $(\cdot,\cdot)$ suffices, of course.
For all other cases, we define $(\cdot,\cdot)^\alpha$ by recursion on $\alpha$. Suppose $(\cdot,\cdot)^\alpha$ has
been defined. Thus $(\cdot,\cdot)^\alpha \in L_{\alpha+1}$. So for $x, y \in L_{\alpha+1}$, we can define, in
$L_{\alpha+1}$, $x \times^{\alpha+1} y = \{(u, v)^\alpha \mid u \in x \wedge v \in y\}$. We can then define, in $L_{\alpha+1}$,
$(x, y)^{\alpha+1} = (x \times^{\alpha+1} \{0\}) \cup (y \times^{\alpha+1} \{1\})$. □


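We call $(\cdot,\cdot)^\alpha$ the $\alpha$-pairing function. Using it, we have:


6.2. LEMMA. If $x \in L_{\alpha+1}$, then for some $\mathcal{L}$-formula $\varphi(v_0, v_1)$ and some
$p \in L_\alpha$, $x = \{z \in L_\alpha \mid \models_{L_\alpha} \varphi(\mathring{z}, \mathring{p})\}$.


PROOF. By "contraction of parameters" using the $\alpha$-pairing function (and
its $L_\alpha$-definable inverses). □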


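6.3. LEMMA. There is a sequence $(<_\alpha \mid \alpha \in \mathrm{On})$ such that:
(i) $(<_\alpha \mid \alpha \in \mathrm{On})$ is $\Delta_1^{\mathrm{ZF}}$ and uniformly $\Delta_1^{L_\lambda}$ for limit $\lambda > \omega$;
(ii) for limit $\lambda > \omega$, $<_\lambda$ is a (uniformly) $\Delta_1^{L_\lambda}$ well-ordering of $L_\lambda$;
(iii) if $\alpha < \beta$, then $<_\beta$ is an end-extension of $<_\alpha$;
(iv) if $x \in y \in L_\alpha$, then $x <_\alpha y$;
(v) for limit $\lambda > \omega$, the function $\mathrm{pr}(x) = \{y \mid y <_\lambda x\}$ is (uniformly) $\Delta_1^{L_\lambda}$.


PROOF. We define $<_\alpha$ by recursion on $\alpha$. Set $<_0 = \emptyset$. And if $\lim(\lambda)$, $\lambda > 0$,
set $<_\lambda = \bigcup_{\alpha < \lambda} <_\alpha$. Suppose now that $<_\alpha$ has been defined. Let $(\varphi_n \mid n <
\omega)$ be some recursive enumeration of the formulas of $\mathcal{L}$ with the free
variables $v_0$ and $v_1$. (We regard this enumeration as fixed from now on.)
For $x \in L_{\alpha+1}$, set:

$n(x)$ = the least $n$ such that for some $p \in L_\alpha$, $x = \{z \in L_\alpha \mid \models_{L_\alpha} \varphi_n(\mathring{z}, \mathring{p})\}$,
$p(x)$ = the $<_\alpha$-least such $p$.

Then, for $x, y \in L_{\alpha+1}$, we set:

\[ x <_{\alpha+1} y \leftrightarrow [x, y \in L_\alpha \wedge x <_\alpha y] \lor [x \in L_\alpha \wedge y \notin L_\alpha] \lor [x, y \notin L_\alpha \wedge n(x) < n(y)] \lor [x, y \notin L_\alpha \wedge n(x) = n(y) \wedge p(x) <_\alpha p(y)]. \]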


Clearly, each $<_\alpha$ is a well-ordering of $L_\alpha$. Since $<_{\alpha+1}$ is always an
end-extension of $<_\alpha$, condition (iii) is immediate. Condition (iv) is also
immediate. The proof of (i) is very similar to the proof of 4.1 (and 4.2), so
we omit the tedious details. For (ii), we have the $\Sigma_1$ definition
$x <_\lambda y \leftrightarrow \exists \alpha < \lambda\, (x <_\alpha y)$, and hence also the $\Pi_1$ definition $x <_\lambda y \leftrightarrow x \ne
y \wedge \neg\exists \alpha < \lambda\, (y <_\alpha x)$. And for (v), we have the $\Sigma_1$ definition

\[ z = \mathrm{pr}(x) \leftrightarrow \exists \nu < \lambda\, [x \in L_\nu \wedge z = \{y \mid y <_\nu x\}], \]

so as in 4.2, this function is $\Delta_1$. □
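The four clauses defining the order at a successor stage amount to a lexicographic comparison. The toy sketch below only illustrates that comparison rule and is not the construction itself: the elements are plain strings, `old_rank` is an assumed position of each old element in the earlier order, and `name` assigns to each new element an assumed pair consisting of its least formula index and its least parameter.

```python
def order_key(x, old_rank, name):
    """Sort key realising the four clauses of the successor-stage order."""
    if x in old_rank:
        # Old elements keep their old order and precede every new element.
        return (0, old_rank[x])
    n, p = name[x]
    # New elements are compared by formula index first, then by parameter.
    return (1, n, old_rank[p])

old_rank = {"a": 0, "b": 1}                            # the order on the old stage
name = {"s": (3, "a"), "t": (1, "b"), "u": (1, "a")}   # (n(x), p(x)) for new sets

elements = list(old_rank) + list(name)
ordered = sorted(elements, key=lambda x: order_key(x, old_rank, name))
print(ordered)  # ['a', 'b', 'u', 't', 's']
```

Because every old element sorts before every new one, the new order is an end-extension of the old one, which is the content of condition (iii).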


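We set $<_L = \bigcup_{\alpha \in \mathrm{On}} <_\alpha$. Thus $<_L$ is a $\Sigma_1^{\mathrm{ZF}}$ well-ordering of $L$ such that
$<_L \cap (L_\lambda \times L_\lambda) = <_\lambda$ for all limit $\lambda$. Letting $\varphi$ be the obvious $\Sigma_1$ $\mathcal{L}$-formula
which defines $<_L$, and $\psi$ the $\Pi_1$-formula $v_0 \ne v_1 \wedge \neg\varphi(v_1, v_0)$, we thus have:


6.4. THEOREM (Gödel; Jensen-Karp). There are $\mathcal{L}$-formulas $\varphi$ and $\psi$, $\Sigma_1$
and $\Pi_1$, respectively, such that:
(i) $\varphi$ and $\psi$ are absolute for $L$;
(ii) $\mathrm{ZF} \vdash$ "$\{(x, y) \mid \varphi(x, y)\}$ well-orders $L$";
(iii) $\mathrm{ZF} + V = L \vdash [\varphi(x, y) \leftrightarrow \psi(x, y)]$.
Hence, in particular, $\mathrm{ZF} \vdash (\mathrm{AC})^L$.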


7. The condensation lemma. The GCH in L 


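We write $X \prec_{\Sigma_n} L_\alpha$ iff $X \subseteq L_\alpha$ and for every $\Sigma_n$ sentence $\varphi$ of $\mathcal{L}_X$, $\models_X \varphi$
iff $\models_{L_\alpha} \varphi$.

Clearly, if $X \subseteq L_\alpha$ is transitive, then $X \prec_{\Sigma_0} L_\alpha$. And for $\lim(\alpha)$,
$n > 0$, $X \prec_{\Sigma_n} L_\alpha$ iff for all $P \in \mathcal{P}(L_\alpha) \cap \Sigma_n^{L_\alpha}(X)$, $P \ne \emptyset \to P \cap X \ne \emptyset$.


7.1. LEMMA (Condensation Lemma). Suppose $\lim(\alpha)$. Let $X \prec_{\Sigma_1} L_\alpha$. Then
there are unique $\beta$, $\pi$ such that:
(i) $\pi : \langle X, \in \rangle \cong \langle L_\beta, \in \rangle$;
(ii) if $Y \subseteq X$ is transitive, then $\pi \upharpoonright Y = \mathrm{id} \upharpoonright Y$;
(iii) $\pi(x) \le_L x$ for all $x \in X$.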


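PROOF. If $\alpha = \omega$, then we must have $X = L_\omega$, so there is nothing further to
prove. So we may assume that $\alpha > \omega$. Since $X \prec_{\Sigma_1} L_\alpha$ and $L_\alpha$ is transitive,
$X$ is well-founded and extensional. Hence by the well-known Mostowski
Collapsing Lemma, there is a unique transitive set $M$ and a unique
isomorphism $\pi : X \cong M$. (This is discussed in 3.7 of Chapter A.7.) We show
that $M = L_\beta$ for some (unique) $\beta$.

By 4.2, there is a $\Sigma_0$ formula $\varphi(v_0, v_1, v_2)$ such that:

(a) $v = L_\gamma \leftrightarrow\ \models_{L_\alpha} \exists z\, \varphi(z, \mathring{v}, \mathring{\gamma})$;

(b) $\models_{L_\alpha} \forall \gamma\, \exists v\, \exists z\, \varphi(z, v, \gamma)$;

(c) $\models_{L_\alpha} \forall x\, \exists \gamma\, \exists v\, \exists z\, [\varphi(z, v, \gamma) \wedge x \in v]$.

Since $\pi^{-1} : M \prec_{\Sigma_1} L_\alpha$, (b) and (c) give:

(d) $\forall \gamma \in M\, [\models_M \exists v\, \exists z\, \varphi(z, v, \mathring{\gamma})]$;

(e) $\forall x \in M\, [\models_M \exists \gamma\, \exists v\, \exists z\, [\varphi(z, v, \gamma) \wedge \mathring{x} \in v]]$.

Since $\Sigma_0$ statements are absolute for transitive sets, (d) and (e) give:

(f) $\forall \gamma \in M\, \exists v \in M\, \exists z \in M\, [\models \varphi(\mathring{z}, \mathring{v}, \mathring{\gamma})]$;

(g) $\forall x \in M\, \exists \gamma \in M\, \exists v \in M\, \exists z \in M\, [\models \varphi(\mathring{z}, \mathring{v}, \mathring{\gamma}) \wedge x \in v]$.

Using (a), (f) and (g) give:

(h) $\forall \gamma \in M\, (L_\gamma \in M)$;

(i) $\forall x \in M\, \exists \gamma \in M\, (x \in L_\gamma)$.

Since $M$ is transitive, (h) gives $\bigcup_{\gamma \in M} L_\gamma \subseteq M$. And by (i), $M \subseteq \bigcup_{\gamma \in M} L_\gamma$.
Hence $M = \bigcup_{\gamma \in M} L_\gamma$. But as $M$ is transitive, $M \cap \mathrm{On} = \beta$, some ordinal $\beta$.
And as $\pi^{-1} : M \prec_{\Sigma_1} L_\alpha$ and $\lim(\alpha)$, we clearly have $\lim(\beta)$. Hence $M = L_\beta$.

Part (i) of the lemma is thus proved. And part (ii) holds by the definition
of the map $\pi$. For part (iii), suppose that $\pi(x) >_L x$ for some $x \in X$. Let $x_0$
be the $<_L$-least such $x$. Since $\pi : X \cong L_\beta$ and $L_\beta$ is "closed" under $<_L$ and
$x_0 <_L \pi(x_0)$, $x_0 \in L_\beta$. Hence for some $x_1 \in X$, $x_0 = \pi(x_1)$. Thus
$\pi(x_1) <_L \pi(x_0)$. So, as $<_\lambda$ is uniformly $\Sigma_1^{L_\lambda}$ for limit $\lambda > \omega$, $x_1 <_L x_0$. Thus by
choice of $x_0$, $\pi(x_1) \le_L x_1$, contrary to $\pi(x_1) = x_0$. □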
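The collapsing map used in the proof of 7.1 satisfies $\pi(x) = \{\pi(z) \mid z \in x \cap X\}$. For a finite extensional set of hereditarily finite sets this recursion can be run directly. The following is a minimal sketch under those assumptions; the name `collapse` and the use of `frozenset` as a stand-in for sets are choices made here.

```python
def collapse(X):
    """Mostowski collapse of a finite set X of hereditarily finite sets.

    Returns a dict sending each x in X to pi(x) = {pi(z) : z in x and z in X}.
    """
    cache = {}

    def pi(x):
        if x not in cache:
            cache[x] = frozenset(pi(z) for z in x if z in X)
        return cache[x]

    return {x: pi(x) for x in X}

empty = frozenset()
one = frozenset({empty})
two = frozenset({empty, one})

# X omits `one`, so it is not transitive, but it is extensional.
X = frozenset({empty, two})
pi = collapse(X)
print(pi[two] == one)  # True
```

Here the gap left by the missing element is closed up: `two` collapses to `one`, the image `{empty, one}` is transitive, and the map is the identity on the transitive part `{empty}` of `X`.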


Using 7.1, we can prove that the GCH holds in L. First a lemma. 


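7.2. LEMMA. Let $\kappa$ be a cardinal in $L$. If $x \in L$ is a bounded subset of $\kappa$, or if
$x \subseteq L_\alpha$ for some $\alpha < \kappa$, then $x \in L_\kappa$.


PROOF. Pick $\alpha < \kappa$ with $x \subseteq L_\alpha$. Pick $\lambda$ such that $\lim(\lambda)$ and $x \in L_\lambda$. Let $M$
be the set of all elements of $L_\lambda$ which are $L_\lambda$-definable from elements of
$L_\alpha \cup \{x\}$. Since $<_\lambda$ is $L_\lambda$-definable, it is easily seen that $M \prec L_\lambda$. So by 7.1,
let $\pi : M \cong L_\gamma$. Notice that $\pi \upharpoonright L_\alpha = \mathrm{id} \upharpoonright L_\alpha$, so in particular $\pi(x) = x$.
(Recall that $\pi(x) = \{\pi(z) \mid z \in x \cap M\}$.) Hence $x \in L_\gamma$. Now, as our
language is countable, $|M|^L = |L_\alpha|^L + \omega$. And by an easy induction argument,
$|L_\nu|^L = |\nu|^L$ for all infinite $\nu$. Hence $|\gamma|^L = |\alpha|^L + \omega < \kappa$. Hence $\gamma < \kappa$, and
$x \in L_\kappa$. □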
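The cardinality computation at the end of this proof can be laid out as a single chain. This is only a restatement of the steps already given, under the assumptions that $\alpha$ and $\gamma$ are infinite and that $\kappa$ is uncountable (if $\gamma$ is finite, then $\gamma < \kappa$ trivially).

```latex
\[
  |\gamma|^L \;=\; |L_\gamma|^L \;=\; |M|^L
  \;=\; |L_\alpha \cup \{x\}|^L + \omega
  \;=\; |\alpha|^L + \omega \;<\; \kappa .
\]
% First equality:  |L_\nu|^L = |\nu|^L for infinite \nu.
% Second equality: \pi is a bijection from M onto L_\gamma.
% Third equality:  M consists of the elements of L_\lambda definable from
%                  parameters in L_\alpha \cup \{x\}, and the language is countable.
% Last step:       \alpha < \kappa and \kappa is a cardinal in L.
```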


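7.3. THEOREM (Gödel). $\mathrm{ZF} \vdash (\mathrm{GCH})^L$.

PROOF. We argue in $L$. By 7.2, for all infinite cardinals $\kappa$, every subset of $\kappa$
lies in $L_{\kappa^+}$, so $2^\kappa \le |L_{\kappa^+}|$. But $|L_{\kappa^+}| = \kappa^+$. Hence
$2^\kappa = \kappa^+$. □

8. $\Sigma_1$ Skolem functions


An important concept in constructibility theory is that of a $\Sigma_1$ Skolem
function.

A $\Sigma_1$ Skolem function for $L_\alpha$ is a $\Sigma_1^{L_\alpha}$ function $h$ such that $\mathrm{dom}(h) \subseteq
\omega \times L_\alpha$, $\mathrm{ran}(h) \subseteq L_\alpha$, and whenever $P \in \Sigma_1^{L_\alpha}(\{x\}) \cap \mathcal{P}(L_\alpha)$ for some $x \in L_\alpha$,
then $\exists y\, P(y) \to \exists i \in \omega\, P(h(i, x))$.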


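8.1. LEMMA. Let $\lim(\alpha)$, and let $h$ be a $\Sigma_1$ Skolem function for $L_\alpha$. If
$x \in L_\alpha$, then $x \in h''(\omega \times \{x\}) \prec_{\Sigma_1} L_\alpha$.


PROOF. Set $N = h''(\omega \times \{x\})$. Clearly, $x \in N$. Let $P \in \Sigma_1^{L_\alpha}(N) \cap \mathcal{P}(L_\alpha)$.
We show that $P \ne \emptyset \to P \cap N \ne \emptyset$. Pick $y_1, \ldots, y_m \in N$ with $P \in
\Sigma_1^{L_\alpha}(\{y_1, \ldots, y_m\})$. By definition of $N$ there are $j_1, \ldots, j_m \in \omega$ with
$y_1 = h(j_1, x), \ldots, y_m = h(j_m, x)$. Since $h$ is $\Sigma_1^{L_\alpha}$, it follows that $P \in
\Sigma_1^{L_\alpha}(\{x\})$. Hence $P \ne \emptyset \to \exists y\, P(y) \to \exists i \in \omega\, P(h(i, x)) \to \exists y \in N\, P(y) \to
P \cap N \ne \emptyset$. □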


8.1 may be generalised in several ways. For instance: 


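8.2. LEMMA. Let $\lim(\alpha)$, and let $h$ be a $\Sigma_1$ Skolem function for $L_\alpha$. If $p \in L_\alpha$
and if $X \subseteq L_\alpha$ is closed under ordered pairs, then $X \cup \{p\} \subseteq
h''(\omega \times (X \times \{p\})) \prec_{\Sigma_1} L_\alpha$.


PROOF. Set $N = h''(\omega \times (X \times \{p\}))$. Clearly, $X \cup \{p\} \subseteq N$. Let $P \in
\Sigma_1^{L_\alpha}(N) \cap \mathcal{P}(L_\alpha)$, $P \ne \emptyset$. Pick $y_1, \ldots, y_m \in N$ with $P \in \Sigma_1^{L_\alpha}(\{y_1, \ldots, y_m\})$.
Pick $j_1, \ldots, j_m \in \omega$ and $x_1, \ldots, x_m \in X$ with $y_1 = h(j_1, (x_1, p)), \ldots, y_m =
h(j_m, (x_m, p))$. Let $x = (x_1, \ldots, x_m)$. By assumption on $X$, $x \in X$. And as $h$ is
$\Sigma_1^{L_\alpha}$, $P$ is $\Sigma_1^{L_\alpha}(\{(x, p)\})$. So, $P \ne \emptyset \to \exists y\, P(y) \to \exists i \in \omega\, P(h(i, (x, p))) \to \exists y \in
N\, P(y) \to P \cap N \ne \emptyset$, and the lemma is proved. □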


Similarly, we have: 


8.3. LEMMA. Let $\lim(\alpha)$, and let $h$ be a $\Sigma_1$ Skolem function for $L_\alpha$. If
$X \subseteq L_\alpha$ and if $h''(\omega \times X)$ is closed under ordered pairs, then $X \subseteq
h''(\omega \times X) \prec_{\Sigma_1} L_\alpha$.


PROOF. Clearly, if $Y = h''(\omega \times X)$, then $h''(\omega \times Y) = h''(\omega \times X)$, so the
result follows from 8.2. □


The above two lemmas give the forms in which $\Sigma_1$ Skolem functions are
often used. As to the existence of $\Sigma_1$ Skolem functions, we show below that
for each limit ordinal $\alpha > \omega$, $L_\alpha$ has a $\Sigma_1$ Skolem function. We require some
preliminaries.


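8.4. LEMMA. Let $\lim(\alpha)$, $\alpha > \omega$. Then $\models_{L_\alpha}^{\Sigma_0}$ (the restriction of $\models_{L_\alpha}$ to the $\Sigma_0$
sentences of $\mathcal{L}_{L_\alpha}$) is (uniformly) $\Sigma_1^{L_\alpha}$.

PROOF. For any $\Sigma_0$ sentence $\varphi$ of $\mathcal{L}_{L_\alpha}$, we have, by absoluteness, $\models_{L_\alpha} \varphi$ iff
$\models_{\mathrm{TC}(\mathrm{ran}(\varphi))} \varphi$, where $\mathrm{TC}(x)$ denotes the transitive closure of $x$. Hence, with
$\mathrm{Sat}(u, \varphi)$ the predicate defined in Section 3, we have, for $\Sigma_0$ $\varphi$, $\models_{L_\alpha} \varphi$ iff
$\mathrm{Sat}(\mathrm{TC}(\mathrm{ran}(\varphi)), \varphi)$. Now, by examining the definition of Sat, we see easily
that in the relation $\mathrm{Sat}(u, \varphi)$, any unbounded quantifiers may be bound by
a set of the form $w = w(u)$, where $w$ is a $\Sigma_1$ function similar to the one
used in 4.1. Hence $\mathrm{Sat}(u, \varphi)$ has the $\Sigma_1^{L_\alpha}$ definition

\[ \mathrm{Sat}(u, \varphi) \leftrightarrow \exists w\, [w = w(u) \wedge S(w, u, \varphi)], \]

where $S(w, u, \varphi)$ is the $\Sigma_0$ formula obtained from the definition of Sat given
in Section 3, by binding all unbounded quantifiers by $w$. But the function
TC is also $\Sigma_1^{L_\alpha}$, having the definition

\[ w = \mathrm{TC}(x) \leftrightarrow \exists f\, [f \text{ is a function} \wedge \mathrm{dom}(f) = \omega \wedge f(0) = x \wedge \forall n \in \omega\, (f(n+1) = \textstyle\bigcup f(n)) \wedge w = \textstyle\bigcup \mathrm{ran}(f)]. \]

Hence $\models_{L_\alpha}^{\Sigma_0}$ is $\Sigma_1^{L_\alpha}$. □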
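The definition of TC used in this proof, with $f(0) = x$, $f(n+1) = \bigcup f(n)$ and $w = \bigcup \mathrm{ran}(f)$, can be run directly on hereditarily finite sets, where the sequence reaches the empty set after finitely many steps. The following is a minimal sketch under that assumption; the name `transitive_closure` is chosen here.

```python
def transitive_closure(x):
    """TC(x) as the union of the stages x, U x, U U x, ... (finite case only)."""
    closure = set()
    stage = frozenset(x)                 # f(0) = x
    while stage:
        closure |= stage                 # accumulate the union of ran(f)
        stage = frozenset(z for y in stage for z in y)   # f(n+1) = U f(n)
    return frozenset(closure)

empty = frozenset()
one = frozenset({empty})
two = frozenset({empty, one})

tc = transitive_closure(frozenset({two}))
print(len(tc))  # 3
```

The result contains `two`, `one` and `empty`, and it is transitive: every element of an element is again an element.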


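8.5. COROLLARY. Let $\lim(\alpha)$, $\alpha > \omega$. Then $\models_{L_\alpha}^{\Sigma_1}$ (the restriction of $\models_{L_\alpha}$ to the
sentences of $\mathcal{L}_{L_\alpha}$ of the form $\exists v_0\, \varphi(v_0)$, where $\varphi$ is $\Sigma_0$) is (uniformly) $\Sigma_1^{L_\alpha}$.


PROOF. Let Sub be the function as defined in Section 3. Arguing much as
we did for Sat in 8.4, we see that $\mathrm{Sub}(\varphi, v, t)$ is $\Sigma_1^{L_\alpha}$. But if $\psi$ is $\Sigma_0$ and
$\varphi = \exists v_0\, \psi$, then

\[ \models_{L_\alpha} \varphi \quad\text{iff}\quad \exists \psi \in L_\alpha\, \exists x \in L_\alpha\, [\varphi = \exists v_0\, \psi \wedge \models_{L_\alpha}^{\Sigma_0} \mathrm{Sub}(\psi, v_0, \mathring{x})], \]

so we are done. □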


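8.6. LEMMA. Let $\lim(\alpha)$, $\alpha > \omega$. Then $L_\alpha$ has a (uniformly $\Sigma_1^{L_\alpha}$) $\Sigma_1$ Skolem
function.


PROOF. Let $(\varphi_i \mid i < \omega)$ be a (fixed, canonical) recursive enumeration of all
formulas of $\mathcal{L}$ of the form $\varphi_i = \exists v_0\, \bar{\varphi}_i(v_0, v_1, v_2)$, where $\bar{\varphi}_i$ is $\Sigma_0$. Define a
(partial) function $r_\alpha$ by:

\[ w = r_\alpha(i, x) \leftrightarrow\ \models_{L_\alpha} \bar{\varphi}_i((\mathring{w})_0, (\mathring{w})_1, \mathring{x}) \wedge \forall z <_L w\, \neg\bar{\varphi}_i((z)_0, (z)_1, \mathring{x}) \]
\[ \phantom{w = r_\alpha(i, x)} \leftrightarrow\ \models_{L_\alpha} \bar{\varphi}_i((\mathring{w})_0, (\mathring{w})_1, \mathring{x}) \wedge \exists u\, [u = \{z \mid z <_L w\} \wedge \forall z \in u\, \neg\bar{\varphi}_i((z)_0, (z)_1, \mathring{x})]. \]

By 8.5 (together with 6.3), $r_\alpha$ is $\Sigma_1^{L_\alpha}$. Hence $h_\alpha$ is $\Sigma_1^{L_\alpha}$, where we set:
$h_\alpha(i, x) = (r_\alpha(i, x))_1$. It is easily seen that $h_\alpha$ is as required. □

We call the function $h_\alpha$ defined above the canonical $\Sigma_1$ Skolem function
for $L_\alpha$. Let $H_i$ be the $\Sigma_0$ formula of $\mathcal{L}$, implicitly given by the above
proof, such that for all limit $\alpha > \omega$, and all $x, y \in L_\alpha$, $y = h_\alpha(i, x)
\leftrightarrow \exists z \in L_\alpha\, [\models_{L_\alpha} H_i(\mathring{z}, \mathring{y}, \mathring{x})]$. In particular, notice that the sequence
$(H_i \mid i < \omega)$ will be recursive, so (by 8.5) $H_\alpha$ will be (uniformly) $\Sigma_1^{L_\alpha}$ (for
limit $\alpha > \omega$), where $H_\alpha(z, y, i, x) \leftrightarrow\ \models_{L_\alpha} H_i(\mathring{z}, \mathring{y}, \mathring{x})$. We fix this notation
from now on.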

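As an example of the use of $\Sigma_1$ Skolem functions, we prove:


8.7. LEMMA. Let $\lim(\alpha)$, $\alpha > \omega$. Then there is a $\Sigma_1(L_\alpha)$ map of $\alpha$ onto $L_\alpha$.


PROOF. By means of a technical, though not unduly difficult argument, we
may show that there is a $\Sigma_1(L_\alpha)$ map of $\alpha$ onto $\alpha \times \alpha$. So, let $p \in L_\alpha$ be
such that there is a $\Sigma_1^{L_\alpha}(\{p\})$ map of $\alpha$ onto $\alpha \times \alpha$, $p$ the $<_L$-least such, and
let $f$ be a $\Sigma_1^{L_\alpha}(\{p\})$ map of $\alpha$ onto $\alpha \times \alpha$. Define $f^0, f^1$ by $f(\nu) = (f^0(\nu), f^1(\nu))$,
all $\nu \in \alpha$. By recursion, define $f_n : \alpha \to \alpha^n$ by: $f_1 = \mathrm{id} \upharpoonright \alpha$; $f_{n+1}(\nu) =
(f^0(\nu), f_n \circ f^1(\nu))$. Thus each $f_n$ is $\Sigma_1^{L_\alpha}(\{p\})$. Let $h = h_\alpha$, and set $X =
h''(\omega \times (\alpha \times \{p\}))$.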
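The recursion that builds a map onto $n$-tuples from a map onto pairs can be illustrated on the natural numbers. In the sketch below the inverse of the Cantor pairing function stands in for the map of $\alpha$ onto $\alpha \times \alpha$; this choice, the names `unpair` and `f_n`, and the flattening of the nested pairs into a Python tuple are all assumptions made for the illustration.

```python
from math import isqrt

def unpair(v):
    """Inverse of the Cantor pairing function: a bijection from N onto N x N."""
    w = (isqrt(8 * v + 1) - 1) // 2
    t = w * (w + 1) // 2
    b = v - t
    a = w - b
    return (a, b)

def f_n(n, v):
    """f_1 is the identity; f_{n+1}(v) = (f^0(v), f_n(f^1(v))), flattened."""
    if n == 1:
        return (v,)
    f0, f1 = unpair(v)
    return (f0,) + f_n(n - 1, f1)

# Every triple with entries below 3 already appears among the first 200 values.
triples = {f_n(3, v) for v in range(200)}
covered = all((a, b, c) in triples
              for a in range(3) for b in range(3) for c in range(3))
print(covered)  # True
```

The first coordinate is read off directly and the second is unfolded again, so surjectivity of the pairing map carries over to every $f_n$ by induction on $n$.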

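Claim 1. $X$ is closed under ordered pairs.

To see this, let $x_1, x_2 \in X$. Pick $j_1, j_2 \in \omega$ and $\nu_1, \nu_2 \in \alpha$ with $x_1 =
h(j_1, (\nu_1, p))$, $x_2 = h(j_2, (\nu_2, p))$. Let $(\nu_1, \nu_2) = f_2(\tau)$. Then $\{(x_1, x_2)\}$ is a
$\Sigma_1^{L_\alpha}(\{(\tau, p)\})$ predicate. So, by definition of $h$, $(x_1, x_2) \in X$, which proves the
claim.

By the claim and 8.3, $X \prec_{\Sigma_1} L_\alpha$. By the condensation lemma, let
$\pi : X \cong L_\beta$. Since $\alpha \subseteq X$, we clearly have $\beta = \alpha$ here.

Claim 2. For all $i \in \omega$ and all $x \in X$, $\pi(h(i, x)) = h(i, \pi(x))$.

To see this, suppose first that $i \in \omega$, $x \in X$, and $y = h(i, x)$. Since $h$ is $\Sigma_1^{L_\alpha}$
and $x \in X \prec_{\Sigma_1} L_\alpha$, $y \in X$. And as $y = h(i, x)$, $\models_{L_\alpha} \exists z\, H_i(z, \mathring{y}, \mathring{x})$, so
$\models_X \exists z\, H_i(z, \mathring{y}, \mathring{x})$, so for some $z \in X$, $\models_X H_i(\mathring{z}, \mathring{y}, \mathring{x})$. Applying $\pi$,
$\models_{L_\alpha} H_i(\pi(z)^\circ, \pi(y)^\circ, \pi(x)^\circ)$. Hence $\models_{L_\alpha} \exists z\, H_i(z, \pi(y)^\circ, \pi(x)^\circ)$. Thus
$\pi(y) = h(i, \pi(x))$. Conversely, if $h(i, \pi(x))$ is defined, then we must have
$h(i, \pi(x)) = \pi(y)$ for some $y$, and we can reverse the above steps to obtain
$y = h(i, x)$. This proves the claim.

Now, $f \subseteq \alpha \times \alpha \times \alpha$ and $\pi \upharpoonright \alpha = \mathrm{id} \upharpoonright \alpha$, so $\pi'' f = f$. And by isomorphism,
$\pi'' f$ is $\Sigma_1^{L_\alpha}(\{\pi(p)\})$. So by choice of $p$, $p \le_L \pi(p)$. But by the condensation
lemma, $\pi(p) \le_L p$. Hence $\pi(p) = p$. So by claim 2, if $i \in \omega$ and $\nu < \alpha$,
$\pi(h(i, (\nu, p))) = h(i, (\nu, p))$. Hence $\pi \upharpoonright X = \mathrm{id} \upharpoonright X$. Hence $X = L_\alpha$.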

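Let $S$ be a $\Sigma_0^{L_\alpha}$ predicate such that

\[ y = h(i, x) \leftrightarrow \exists z \in L_\alpha\, S(z, y, i, x). \]

Define $g : \alpha \times \alpha \times \alpha \to L_\alpha$ by

\[ g(i, \nu, \tau) = \begin{cases} y, & \text{if } y \in L_\tau \text{ and } \exists z \in L_\tau\, S(z, y, i, (\nu, p)), \\ \emptyset, & \text{otherwise.} \end{cases} \]

Then $g$ has the $\Sigma_1(L_\alpha)$ definition

\[ y = g(i, \nu, \tau) \leftrightarrow [y \in L_\tau \wedge \exists z \in L_\tau\, S(z, y, i, (\nu, p))] \lor [\forall y' \in L_\tau\, \forall z \in L_\tau\, \neg S(z, y', i, (\nu, p)) \wedge y = \emptyset]. \]

And clearly, $g''(\omega \times \alpha \times \alpha) = h''(\omega \times (\alpha \times \{p\})) = X = L_\alpha$. Hence $g \circ f_3$ is as
required. □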


9. Admissible ordinals 


An admissible set is, it will be recalled, a transitive set $M$ which satisfies
the following conditions:
(I) $M \neq \emptyset$,
(II) $\forall x \in M\, (\bigcup x \in M)$,
(III) $\forall x, y \in M\, (\{x, y\} \in M)$,
(IV) $\forall u \in M\, (P \cap u \in M)$, where $P \in \Sigma_0(M) \cap \mathcal{P}(M)$ ($\Sigma_0$-Comprehension),
(V) $\forall x \in M\, \exists y \in M\, P(y, x) \to \forall u \in M\, \exists v \in M\, \forall x \in u\, \exists y \in v\, P(y, x)$,
where $P \in \Sigma_0(M) \cap \mathcal{P}(M)$ ($\Sigma_0$-Replacement, or, more accurately,
$\Sigma_0$-Collection).

Moreover, if $M$ is an admissible set, we may show that in fact (IV) and
(V) hold for the case where $P$ is $\Delta_1(M)$ and $\Sigma_1(M)$, respectively, thereby
obtaining the principles of $\Delta_1$-Comprehension and $\Sigma_1$-Replacement. See
Section 2 of Chapter A.7 on admissible sets.

An ordinal $\alpha$ is admissible just in case $L_\alpha$ is an admissible set. (It is not
hard to show that $\alpha$ will be admissible iff there is an admissible set $M$ with
$\alpha = M \cap \mathrm{On}$, so the dependence of our definition upon the constructible
hierarchy here is only apparent.)

It is easily seen that if $\lim(\alpha)$, then $L_\alpha$ satisfies (I)-(IV) of the definition
of admissibility. Hence:


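9.1. LEMMA. Let $\lim(\alpha)$. Then $\alpha$ is admissible iff for each $\Sigma_0(L_\alpha)$ predicate
$P(y, x)$ on $L_\alpha$,


$$\forall x \in L_\alpha\, \exists y \in L_\alpha\, P(y, x) \to \forall u \in L_\alpha\, \exists v \in L_\alpha\, \forall x \in u\, \exists y \in v\, P(y, x).$$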


Using 9.1, we have:


9.2. LEMMA. Let $\lim(\alpha)$. Then $\alpha$ is admissible iff there is no $\Sigma_1(L_\alpha)$
mapping from an ordinal $\delta < \alpha$ cofinally into $\alpha$.


PROOF. Suppose $\alpha$ is admissible. Let $f : \delta \to \alpha$ be $\Sigma_1(L_\alpha)$, where $\delta < \alpha$.
Then $\forall \xi \in \delta\, \exists \zeta\, (\zeta = f(\xi))$. So by $\Sigma_1$-Replacement there is $\gamma \in \alpha$ with
$\forall \xi \in \delta\, \exists \zeta \in \gamma\, (\zeta = f(\xi))$. Hence $f''\delta \subseteq \gamma$, and $f$ is not cofinal.

Conversely, suppose there is no $\Sigma_1(L_\alpha)$ mapping from an ordinal $\delta < \alpha$
cofinally into $\alpha$. Using 9.1, we show that $L_\alpha$ is admissible. Let $P(y, x)$ be a
$\Sigma_0(L_\alpha)$ predicate such that $\forall x \in L_\alpha\, \exists y \in L_\alpha\, P(y, x)$. Let $u \in L_\alpha$ be given.
We seek a $v \in L_\alpha$ such that $\forall x \in u\, \exists y \in v\, P(y, x)$. Define a $\Sigma_1(L_\alpha)$ map
$f : u \to \alpha$ by $f(x)$ = the least $\gamma$ such that $\exists y \in L_\gamma\, P(y, x)$. Let $\delta$ be the least
ordinal such that $u \subseteq L_\delta$. Extend $f$ trivially to $L_\delta$. By our assumption,
together with 8.7, $f$ cannot be cofinal in $\alpha$. So there is a $\rho < \alpha$ with
$f''L_\delta \subseteq \rho$. Hence $v = L_\rho$ is as sought. □


Our next result strengthens 9.2 considerably. 


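9.3. LEMMA. Let $\lim(\alpha)$. Then $\alpha$ is admissible iff there is no $\Sigma_1(L_\alpha)$ map
from an ordinal $\delta < \alpha$ onto $\alpha$.


PROOF. ($\to$) This part follows directly from 9.2.


($\leftarrow$) Assume that $\alpha$ is not admissible. We show that there is a $\Sigma_1(L_\alpha)$
map from an ordinal $\delta < \alpha$ onto $\alpha$. Now, by 9.2 we know that there is a
$\delta < \alpha$ and a $\Sigma_1(L_\alpha)$ map $f$ of $\delta$ cofinally into $\alpha$. Let $f$ be $\Sigma_1^{L_\alpha}(\{p\})$. If
$\alpha = \gamma + \omega$ for some $\gamma$, we are clearly done. Hence we may assume
otherwise, and pick a limit ordinal $\gamma < \alpha$ with $\delta, p \in L_\gamma$. Let $h = h_\alpha$.
Set $X = h''(\omega \times L_\gamma)$. Since $L_\gamma$ is closed under ordered pairs, $X \prec_{\Sigma_1} L_\alpha$. Let
$\pi : X \cong L_\beta$. Since $\pi \restriction L_\gamma = \mathrm{id} \restriction L_\gamma$, an argument as in the proof of claim 2 of
8.7 tells us that $\pi \restriction X = \mathrm{id} \restriction X$. Hence $X = L_\beta$. Now, $f$ is $\Sigma_1^{L_\alpha}(\{p\})$ and
$p \in X \prec_{\Sigma_1} L_\alpha$, so $X$ is closed under $f$. But $\delta \subseteq X$ and $f$ is cofinal in $\alpha$. Hence
$\alpha \subseteq X$. Hence $\beta = \alpha$ and $X = L_\alpha$.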

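Let $S$ be a $\Sigma_0^{L_\alpha}$ predicate such that

$$y = h(i, x) \leftrightarrow \exists z \in L_\alpha\, S(z, y, i, x).$$

Define $g : \omega \times \delta \times L_\gamma \to L_\alpha$ by

$$g(i, \nu, x) =
\begin{cases}
y & \text{if } y \in L_{f(\nu)} \ \&\ \exists z \in L_{f(\nu)}\, S(z, y, i, x), \\
\emptyset & \text{otherwise.}
\end{cases}$$

As in the proof of 8.7, we see that $g$ is $\Sigma_1^{L_\alpha}(\{p\})$. Moreover,
$g''(\omega \times \delta \times L_\gamma) = h''(\omega \times L_\gamma) = X = L_\alpha$. But by 8.7 there is a $\Sigma_1(L_\gamma)$ map $j$
of $\gamma$ onto $\omega \times \delta \times L_\gamma$. Hence $g \circ j$ is a $\Sigma_1(L_\alpha)$ map of $\gamma$ onto $L_\alpha$. □

Chapter C.5 discusses recursion theory on admissible ordinals and its
interaction with constructibility.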


10. The Souslin hypothesis 


Some of the best illustrations of the methods of constructibility theory 
are provided by the Souslin Hypothesis and its generalisations. 

The Souslin Hypothesis, SH, is the assertion that whenever X is a dense 
linear ordering with no end-points, complete under the formation of sups 
and infs, and with the property that any collection of pairwise disjoint open 
intervals is countable, then X is isomorphic to the real line. (The real line 
clearly possesses all of the properties just stated.) 

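An old theorem of Miller characterises SH in terms of trees (see
Theorem 4.11 of Chapter B.3). A tree is a poset $T = (T, \le_T)$ such that
$\hat{x} = \{y \in T \mid y <_T x\}$ is well-ordered by $<_T$ for all $x \in T$. The order-type of $\hat{x}$
under $<_T$ is called the height of $x$ in $T$, $\mathrm{ht}(x)$. The $\alpha$-th level of $T$ is the set
$T_\alpha = \{x \in T \mid \mathrm{ht}(x) = \alpha\}$. We set $T \restriction \alpha = \bigcup_{\beta < \alpha} T_\beta$.

Let $\theta$ be a limit ordinal, $\lambda$ a cardinal. $T$ is a $(\theta, \lambda)$-tree iff:

(i) $\forall \alpha < \theta\, (0 < |T_\alpha| < \lambda) \wedge |T_0| = 1 \wedge T_\theta = \emptyset$,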
(ii) Va < OWx E Te (\{y © Tari x < y}| = 2), 

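(iii) $\forall \alpha < \beta < \theta\, \forall x \in T_\alpha\, \exists y \in T_\beta\, (x <_T y)$,

(iv) $\forall \alpha < \theta\, \forall x, y \in T_\alpha\, (\lim(\alpha) \to (x = y \leftrightarrow \hat{x} = \hat{y}))$.
If $\kappa$ is a cardinal, a $\kappa$-tree is a $(\kappa, \kappa)$-tree.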

Let $T$ be a tree. A branch of $T$ is a maximal linearly ordered subset of $T$.
An antichain of $T$ is a pairwise incomparable subset of $T$. A branch of $T$ is
an $\alpha$-branch if $\alpha$ is its order type. An Aronszajn tree is an $\omega_1$-tree with no
$\omega_1$-branch. A classical theorem of Aronszajn is that such a tree exists (in
ZFC). A Souslin tree is an $\omega_1$-tree with no uncountable antichain. It is
easily seen that any Souslin tree is Aronszajn. The converse is not
provable. Indeed, in ZFC it is not decidable whether or not a Souslin tree
exists. (See DEVLIN and JOHNSBRÅTEN [1974] for details.) Miller's theorem is
that SH is equivalent to the non-existence of a Souslin tree. The easy proof
may be found in DEVLIN and JOHNSBRÅTEN [1974]. Hence SH is undecidable
in ZFC. It is, however, decidable if we assume $V = L$.
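
The remark that every Souslin tree is Aronszajn can be checked along the following lines. This is a sketch of the usual argument, not taken from the text; the only assumption made is that, by clause (ii) of the definition, every point of an $\omega_1$-tree has at least two successors on the next level.

```latex
Let $T$ be an $\omega_1$-tree with an $\omega_1$-branch $b$; we exhibit an
uncountable antichain. For $\alpha<\omega_1$ let $x_\alpha$ be the unique
element of $b\cap T_\alpha$. By (ii), pick $y_\alpha\in T_{\alpha+1}$ with
\[
  x_\alpha <_T y_\alpha , \qquad y_\alpha \neq x_{\alpha+1} .
\]
Let $\alpha<\beta<\omega_1$. Since $\mathrm{ht}(y_\alpha)<\mathrm{ht}(y_\beta)$,
the two points are distinct and $y_\beta\not<_T y_\alpha$. If
$y_\alpha<_T y_\beta$, then $y_\alpha$ and $x_{\alpha+1}$ both lie in
$\hat{y}_\beta\cap T_{\alpha+1}$ (note $x_{\alpha+1}\le_T x_\beta<_T y_\beta$).
As $\hat{y}_\beta$ is well-ordered, it meets each level in at most one point,
so $y_\alpha=x_{\alpha+1}$, contrary to the choice of $y_\alpha$.
Hence $\{y_\alpha \mid \alpha<\omega_1\}$ is an uncountable antichain, and
$T$ is not Souslin.
```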


10.1. THEOREM (Jensen). Assume $V = L$. Then $\neg$SH.


PROOF. We construct a Souslin tree $T$ by recursion on the levels. The
elements of $T$ will be countable sequences of 0's and 1's, and the tree
ordering will be ordinary inclusion. Moreover, $T$ will be such that
$s \subseteq t \in T \to s \in T$, whence $s \in T_\alpha \to s \in 2^\alpha$. To commence, we set $T_0 = \{\emptyset\}$.
If $T_\alpha$ is defined, we set

$$T_{\alpha+1} = \{s^\frown\langle 0\rangle \mid s \in T_\alpha\} \cup \{s^\frown\langle 1\rangle \mid s \in T_\alpha\}.$$


Finally, suppose $\lim(\alpha)$ and that $T \restriction \alpha$ is defined. In order to define $T_\alpha$ we
make use of the function $f : \omega_1 \to \omega_1$ defined by $f(\xi)$ = the least $\gamma$ such that
$\models_{L_\gamma}$ "$\xi$ is countable". (To see that $f$ is well-defined, use 7.2.) For each
$x \in T \restriction \alpha$, let $s_x = \bigcup b_x$, where $b_x$ is the $<_L$-least $\alpha$-branch of $T \restriction \alpha$ which
contains $x$ and which meets every maximal antichain of $T \restriction \alpha$ lying in $L_{f(\alpha)}$.
Since $\mathrm{cf}(\alpha) = \omega$ and $|L_{f(\alpha)}| = \omega$, $b_x$ is always defined, whence so is $s_x$. And
$s_x \in 2^\alpha$, of course. Let $T_\alpha = \{s_x \mid x \in T \restriction \alpha\}$. This defines $T = \bigcup_{\alpha < \omega_1} T_\alpha$.
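
The successor step of this recursion, and the notions of antichain and maximal antichain used at limit stages, can be illustrated on the finite levels of the tree, where the construction simply produces all finite 0-1 sequences. The following is a minimal sketch; the function names are illustrative and not from the text, and sequences are represented as Python tuples.

```python
from itertools import combinations


def successor_level(level):
    """T_{a+1}: extend every sequence s in T_a by a final 0 and by a final 1."""
    return {s + (bit,) for s in level for bit in (0, 1)}


def build_levels(n):
    """Return [T_0, ..., T_n], starting from T_0 = {empty sequence}."""
    levels = [{()}]
    for _ in range(n):
        levels.append(successor_level(levels[-1]))
    return levels


def below(s, t):
    """Tree order: s is a proper initial segment of t."""
    return len(s) < len(t) and t[:len(s)] == s


def comparable(s, t):
    return s == t or below(s, t) or below(t, s)


def is_antichain(nodes):
    """True if the nodes are pairwise incomparable."""
    return all(not comparable(s, t) for s, t in combinations(nodes, 2))


def is_maximal_antichain(nodes, tree):
    """Antichain such that every node of `tree` is comparable with a member."""
    nodes = set(nodes)
    return is_antichain(nodes) and all(
        any(comparable(t, a) for a in nodes) for t in tree
    )


levels = build_levels(3)
# The tree restricted to 3, i.e. the union of the levels T_0, T_1, T_2.
tree_below_3 = set().union(*levels[:3])
```

In this finite picture the height of a sequence is its length, and a maximal antichain of `tree_below_3` such as `{(0,), (1, 0), (1, 1)}` has an element below every member of `levels[3]`, which is the property the limit step of the proof arranges for the antichains lying in $L_{f(\alpha)}$.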

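We show that $T$ is a Souslin tree. Clearly, $T$ is an $\omega_1$-tree. So, if $T$ were
not Souslin, we could find an uncountable maximal antichain of $T$. Suppose
then that $A$ is the $<_L$-least uncountable maximal antichain of $T$. Now, by
7.2, $T \in L_{\omega_2}$. And clearly, $T$ is $L_{\omega_2}$-definable. Hence $A \in L_{\omega_2}$ is also
$L_{\omega_2}$-definable. So, if $M$ is the smallest $M \prec L_{\omega_2}$ (i.e. $M$ is the set of all
$L_{\omega_2}$-definable elements of $L_{\omega_2}$), $T, A \in M$.

Claim: $M \cap \omega_1 \in \omega_1$.

Since $M$ is clearly countable, to prove the claim, it suffices to show that
$M \cap \omega_1$ is transitive. Let $\gamma \in M \cap \omega_1$. Since $\gamma \in \omega_1$, $\models_{L_{\omega_2}}$ "$\gamma$ is countable".
Let $j$ be the $<_L$-least map of $\omega$ onto $\gamma$. Since $\gamma \in M \prec L_{\omega_2}$ and $j$ is
$L_{\omega_2}$-definable from $\gamma$, $j \in M$. But $\omega \subseteq M$. Hence $\gamma = j''\omega \subseteq M$, and we are
done.

By the claim, set $\alpha = M \cap \omega_1$. Let $\pi : M \cong L_\beta$. Clearly, $\pi \restriction L_\alpha = \mathrm{id} \restriction L_\alpha$,
$\pi(\omega_1) = \alpha$, $\pi(T) = T \restriction \alpha$, $\pi(A) = A \cap \alpha = A \cap (T \restriction \alpha)$. Also, since $\models_{L_{\omega_2}}$ "$A$
is a maximal antichain of $T$", $\models_{L_\beta}$ "$A \cap \alpha$ is a maximal antichain of $T \restriction \alpha$",
so $A \cap \alpha$ really is a maximal antichain of $T \restriction \alpha$. And of course $A \cap \alpha \in L_\beta$.
But look, $\alpha = \omega_1^{L_\beta}$ and $\alpha$ is countable in $L_{f(\alpha)}$, so we must have $\beta < f(\alpha)$.
Hence $A \cap \alpha \in L_{f(\alpha)}$. Hence, by definition of $T_\alpha$, every element of $T_\alpha$ lies
above an element of $A \cap \alpha$. Hence $A \cap \alpha$ is a maximal antichain in $T$.
Hence $A = A \cap \alpha$. But this is absurd, since $A$ is uncountable. Hence $T$
must be a Souslin tree, and we are done. □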


For a direct construction of a Souslin line see Theorem 3.8 of Chapter 
B.3. 


11. The generalised Souslin hypothesis 


The notion of a Souslin tree clearly generalises from $\omega_1$ to any regular
cardinal $\kappa$. Thus, a $\kappa$-Souslin tree is a $\kappa$-tree with no antichain of
cardinality $\kappa$. (And any such tree will be a $\kappa$-Aronszajn tree, as in the case
$\kappa = \omega_1$.) We denote by SH($\kappa$) the assertion that there is no $\kappa$-Souslin tree.
By examining the proof of 10.1, we shall prove that if $V = L$, then SH($\kappa$)
fails for all successor cardinals. Later on we shall consider the case of
inaccessible cardinals.

Now, when one tries to generalise the proof of 10.1 to the case of an 
arbitrary successor cardinal, one runs into the problem that one must 
extend branches at limit levels of uncountable cofinality. And this is not 
always easy to do. For instance, in attempting to construct an $\omega_2$-Souslin
tree $T$, already $T \restriction \omega_1$ might be an Aronszajn tree, in which case one can go
no further. In order to avoid such difficulties, one sets up (in L) some 
combinatorial machinery, and then uses this machinery to construct the 
Souslin tree. Unfortunately, due to the limitations of space, we cannot give 
any prior motivation for the machinery itself, but must rely upon its 
application to the construction of Souslin trees to provide such motivation. 

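Let $\kappa$ be any cardinal, possibly singular. Let $E \subseteq \kappa^+$. By $\square_\kappa(E)$ we mean
the following assertion: There is a sequence $\langle C_\lambda \mid \lambda < \kappa^+ \wedge \lim(\lambda)\rangle$ such
that:

(i) $C_\lambda$ is a closed unbounded subset of $\lambda$;

(ii) $\mathrm{cf}(\lambda) < \kappa \to |C_\lambda| < \kappa$;

(iii) if $\gamma$ is a limit point of $C_\lambda$, then $\gamma \notin E$ and $C_\gamma = \gamma \cap C_\lambda$.

(Notice that by (ii) and (iii) in the above, $\mathrm{cf}(\lambda) = \kappa \to \mathrm{otp}(C_\lambda) = \kappa$, where
otp means order type.)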
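
The parenthetical remark can be unpacked as follows. This is a sketch of the standard argument rather than the author's own wording, and it assumes that $\kappa$ is regular and uncountable (for singular $\kappa$ the hypothesis $\mathrm{cf}(\lambda) = \kappa$ is never satisfied).

```latex
% Assumption (not stated in the text): \kappa is regular and uncountable.
Suppose $\mathrm{cf}(\lambda)=\kappa$. Since $C_\lambda$ is unbounded in
$\lambda$ by (i), $\mathrm{otp}(C_\lambda)\ge\mathrm{cf}(\lambda)=\kappa$.
Suppose, towards a contradiction, that $\mathrm{otp}(C_\lambda)>\kappa$.
As $C_\lambda$ has no largest element, $\mathrm{otp}(C_\lambda)$ is a limit
ordinal, so $\mathrm{otp}(C_\lambda)\ge\kappa+\omega$. Let $\gamma$ be the
supremum of the first $\kappa+\omega$ elements of $C_\lambda$. Then
$\mathrm{cf}(\gamma)=\omega\neq\kappa$, so $\gamma<\lambda$, and $\gamma$ is a
limit point of $C_\lambda$. By (iii),
\[
  C_\gamma=\gamma\cap C_\lambda ,
  \qquad\text{so}\qquad
  \mathrm{otp}(C_\gamma)=\kappa+\omega
  \quad\text{and}\quad |C_\gamma|=\kappa .
\]
But $\mathrm{cf}(\gamma)=\omega<\kappa$, so (ii) gives $|C_\gamma|<\kappa$,
a contradiction. Hence $\mathrm{otp}(C_\lambda)=\kappa$.
```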

We denote $\square_\kappa(\emptyset)$ by $\square_\kappa$. In Section 13 we shall prove the following result:


11.1. THEOREM (Jensen). Assume $V = L$. Let $\kappa$ be any cardinal. Then there
is a stationary set $E \subseteq \kappa^+$ such that $\alpha \in E \to \mathrm{cf}(\alpha) = \omega$ and $\square_\kappa(E)$
holds.


As well as $\square_\kappa$, we also require the following combinatorial principle. As
for $\square_\kappa$, this principle is also due to Jensen.

Let $\kappa$ be a regular cardinal, $E \subseteq \kappa$. By $\Diamond_\kappa(E)$ we mean the assertion that
there is a sequence $\langle S_\alpha \mid \alpha \in E\rangle$ such that $S_\alpha \subseteq \alpha$ and for every $X \subseteq \kappa$, the
set $\{\alpha \in E \mid X \cap \alpha = S_\alpha\}$ is stationary in $\kappa$.

Of course, for $\Diamond_\kappa(E)$ to hold, it is necessary that $E$ itself be stationary in
$\kappa$. Assuming $V = L$, this condition is also sufficient here.


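11.2. THEOREM (Jensen). Assume $V = L$. Let $\kappa$ be any uncountable regular
cardinal, and let $E$ be any stationary subset of $\kappa$. Then $\Diamond_\kappa(E)$ holds.


PROOF. By recursion on $\alpha \in E$, define $(S_\alpha, C_\alpha)$ as the $<_L$-least pair such
that $S_\alpha, C_\alpha \subseteq \alpha$, $C_\alpha$ is closed and unbounded in $\alpha$, and
$\gamma \in C_\alpha \cap E \to S_\alpha \cap \gamma \neq S_\gamma$; with $S_\alpha = C_\alpha = \emptyset$ if either $\mathrm{succ}(\alpha)$ or else
$\lim(\alpha)$ but no such pair exists. We claim that $\langle S_\alpha \mid \alpha \in E\rangle$ is as required.
Well, suppose not. Let $(S, C)$ be the $<_L$-least pair such that $S, C \subseteq \kappa$, $C$ is
closed and unbounded in $\kappa$, and $\alpha \in C \cap E \to S \cap \alpha \neq S_\alpha$. Since
$\langle S_\alpha \mid \alpha \in E\rangle$ is clearly definable from $E$ in $L_{\kappa^+}$, so is $(S, C)$. Define a
sequence $\langle N_\nu \mid \nu < \kappa\rangle$ of elementary submodels of $L_{\kappa^+}$ as follows:

$N_0$ = the smallest $N \prec L_{\kappa^+}$ such that $N \cap \kappa \in \mathrm{On}$ and $E \in N$.

$N_{\nu+1}$ = the smallest $N \prec L_{\kappa^+}$ such that $N \cap \kappa \in \mathrm{On}$ and $N_\nu \cup \{N_\nu\} \subseteq N$.

$N_\lambda = \bigcup_{\nu < \lambda} N_\nu$, if $\lim(\lambda)$.

Set $\alpha_\nu = N_\nu \cap \kappa$, each $\nu$. Clearly, $\langle \alpha_\nu \mid \nu < \kappa\rangle$ is a strictly increasing,
continuous sequence, cofinal in $\kappa$. So we can pick a $\nu$ such that
$\alpha_\nu = \nu \in E \cap C$. Let $\pi : N_\nu \cong L_\beta$. Then $\pi \restriction \nu = \mathrm{id} \restriction \nu$, $\pi(\kappa) = \nu$, and
$\pi((S, C)) = (S \cap \nu, C \cap \nu)$, and $\pi(\langle S_\alpha \mid \alpha \in E\rangle) = \langle S_\alpha \mid \alpha \in E \cap \nu\rangle$. And
since $\pi^{-1} : L_\beta \prec L_{\kappa^+}$, $(S \cap \nu, C \cap \nu)$ is the $<_L$-least pair such that $S \cap \nu$,
$C \cap \nu \subseteq \nu$, $C \cap \nu$ is closed and unbounded in $\nu$, and
$\gamma \in C \cap \nu \cap E \to S \cap \nu \cap \gamma \neq S_\gamma$. Hence $(S \cap \nu, C \cap \nu) = (S_\nu, C_\nu)$. In particular,
$S \cap \nu = S_\nu$. But $\nu \in C \cap E$, so this contradicts the choice of $(S, C)$. The
proof is complete. □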


We are now in a position to show that SH($\kappa$) fails for all successor
cardinals $\kappa$ in $L$.


11.3. THEOREM (Jensen). Let $\kappa$ be any cardinal. Suppose that there is a
stationary set $E \subseteq \kappa^+$ such that $\square_\kappa(E)$ and $\Diamond_{\kappa^+}(E)$ hold. Then $\neg$SH($\kappa^+$).
Hence (by 11.1 and 11.2) if $V = L$, then $\forall \kappa\, [\neg\mathrm{SH}(\kappa^+)]$.


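PROOF. Let $\langle C_\lambda \mid \lambda < \kappa^+ \wedge \lim(\lambda)\rangle$ satisfy $\square_\kappa(E)$ and let $\langle S_\alpha \mid \alpha \in E\rangle$ satisfy
$\Diamond_{\kappa^+}(E)$. We construct a $\kappa^+$-Souslin tree, $T$, by recursion on the levels, so
that for each limit ordinal $\alpha < \kappa^+$, $T \restriction \alpha$ is an $(\alpha, |\alpha|^+)$-tree. The elements
of $T$ will be members of $\kappa^+$, and we shall have $\alpha <_T \beta \to \alpha < \beta$. We set
$T_0 = \{0\}$. If $T_\alpha$ is defined, $T_{\alpha+1}$ is obtained by appointing two new ordinals
as extensions of each member of $T_\alpha$. Suppose finally that $\lim(\alpha)$ and $T \restriction \alpha$ is
defined. For each element $x$ of $T \restriction \alpha$ we attempt to define an $\alpha$-branch $b^\alpha_x$
of $T \restriction \alpha$, containing $x$, as follows.

Let $x \in T \restriction \alpha$ be given. Let $\langle \gamma_\nu \mid \nu < \lambda\rangle$ be the monotone enumeration of
$C_\alpha$. Let $\nu(x)$ be the least $\nu$ such that $x \in T \restriction \gamma_\nu$. Define a sequence
$\langle p^x_\nu \mid \nu(x) \le \nu < \lambda\rangle$ of elements of $T \restriction \alpha$ as follows:

$p^x_{\nu(x)}$ = the least (ordinalwise) $y \in T_{\gamma_{\nu(x)}}$ such that $x \le_T y$.

$p^x_{\nu+1}$ = the least $y \in T_{\gamma_{\nu+1}}$ such that $p^x_\nu \le_T y$.

And if $\lim(\eta)$,
$p^x_\eta$ = the unique $y \in T_{\gamma_\eta}$ such that for all $\nu < \eta$, $p^x_\nu \le_T y$ (if it exists).

If the above definition breaks down (i.e. if $p^x_\eta$ is undefined for some limit
ordinal $\eta < \lambda$), then the entire construction breaks down. But for the
meantime, let us assume that this sorry state of affairs does not arise, and
describe how $b^\alpha_x$ is defined. Set $b^\alpha_x = \{y \in T \restriction \alpha \mid \exists \nu < \lambda\, (y \le_T p^x_\nu)\}$. Clearly,
$b^\alpha_x$ is an $\alpha$-branch of $T \restriction \alpha$, containing $x$.

We define $T_\alpha$ as follows. If $\alpha \notin E$, let $T_\alpha$ consist of one-point extensions
of each $b^\alpha_x$ for $x \in T \restriction \alpha$. If $\alpha \in E$ and $S_\alpha$ is not a maximal antichain of
$T \restriction \alpha$, do likewise. Otherwise, let $T_\alpha$ consist of one-point extensions of each
$b^\alpha_x$ such that $x$ lies above an element of $S_\alpha$.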

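That completes the definition of $T = \bigcup_{\alpha < \kappa^+} T_\alpha$. Providing the definition
of the $p$-sequence did not break down at any stage, $T$ will clearly be a
$\kappa^+$-tree. Suppose that the definition of some $p$-sequence did break
down. Let $\alpha$ be the least limit ordinal such that for some $x \in T \restriction \alpha$, the
sequence $\langle p^x_\nu \mid \nu(x) \le \nu < \lambda\rangle$ was not well-defined. Let $\eta < \lambda$ be the least
(limit) ordinal such that $p^x_\eta$ was not defined. Now, since $\lim(\eta)$, $\gamma_\eta$ is a limit
point of $C_\alpha$, so $\gamma_\eta \notin E$ and $C_{\gamma_\eta} = \gamma_\eta \cap C_\alpha = \{\gamma_\nu \mid \nu < \eta\}$. Thus if we define
$\langle q^x_\nu \mid \nu(x) \le \nu < \eta\rangle$ from $\gamma_\eta$ as $\langle p^x_\nu \mid \nu(x) \le \nu < \lambda\rangle$ was defined from $\alpha$, then
for all $\nu < \eta$, $q^x_\nu = p^x_\nu$. But look, $\langle q^x_\nu \mid \nu(x) \le \nu < \eta\rangle$ determined the
$\gamma_\eta$-branch $b^{\gamma_\eta}_x$, and since $\gamma_\eta \notin E$, this branch has an extension on $T_{\gamma_\eta}$. Hence
the sequence $\langle p^x_\nu \mid \nu(x) \le \nu < \eta\rangle$ has an extension on $T_{\gamma_\eta}$. Thus $p^x_\eta$ is defined,
a contradiction. Hence $T$ is a well-defined $\kappa^+$-tree, as required.

We claim that $T$ is Souslin. Suppose otherwise. Let $A$ be a maximal
antichain of $T$ of cardinality $\kappa^+$. Set
$C = \{\alpha \in \kappa^+ \mid T \restriction \alpha \subseteq \alpha \ \&\ A \cap \alpha$ is a maximal antichain of $T \restriction \alpha\}$,
a closed unbounded subset of $\kappa^+$. Pick
$\alpha \in C \cap E$ such that $A \cap \alpha = S_\alpha$. Then $T_\alpha$ was constructed so that every
element of $T_\alpha$ lies above an element of $A \cap \alpha$. Hence $A \cap \alpha$ is a maximal
antichain in $T$. Hence $A = A \cap \alpha$, which is absurd. Hence $T$ is Souslin.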


REMARK. By a combinatorial argument, one can show that in the presence
of GCH, $\square_\kappa$ implies that there is a stationary set $E \subseteq \kappa^+$ for which $\square_\kappa(E)$
and $\Diamond_{\kappa^+}(E)$ hold. Hence GCH + $\square_\kappa$ is already strong enough to yield
$\neg$SH($\kappa^+$). The details of all of this (which followed some observations of
Gregory) will be given in the forthcoming revised edition of DEVLIN [1973].


12. The fine structure of the constructible hierarchy 


The deeper results about the constructible universe, and in particular the 
proof of the combinatorial principle $\square_\kappa$, depend* upon a detailed study of


* See footnote 5.




the levels of the constructible hierarchy. This study, due to Jensen, is 
exceedingly technical, and in general it is not necessary to be familiar with 
all the details in order to apply the results yielded by the theory. We shall 
therefore content ourselves here with a brief description of the theory, and 
refer the reader to DEVLIN [1973] for further details. (Actually, for
technical reasons, the development in DEVLIN [1973] is carried out not for
the constructible hierarchy as we have defined it here, but for a slightly 
modified hierarchy. However, the proofs can all be modified to cover the 
usual hierarchy, so we shall give our description here for the usual 
hierarchy.) 

Basically, the idea behind the fine structure theory is this. We have seen 
that the constructible hierarchy is 27" and uniformly 2f- for limit a > , 
that $\Sigma_1$-submodels of any limit $L_\alpha$ are isomorphic to limit levels of the
hierarchy itself, and that limit $L_\alpha$'s ($\alpha > \omega$) have uniformly $\Sigma_1^{L_\alpha}$ $\Sigma_1$ Skolem
functions. Thus, we can carry out condensation arguments, etc. in a
uniform manner in order to obtain results about $\Sigma_1$ predicates over the
hierarchy. To carry out analogous arguments for $\Sigma_n$ predicates is, in
general, not possible. For example, although it is possible to construct "$\Sigma_n$
Skolem functions" for limit $L_\alpha$'s ($\alpha > \omega$), they tend to be rather complicated
objects, and certainly not uniformly defined. But it is not hard to see
that many of our nice, uniform results about limit $L_\alpha$'s and $\Sigma_1$ predicates on
them carry through virtually unchanged for structures of the form $\langle L_\alpha, A\rangle$,
where $\lim(\alpha)$, $\alpha > \omega$, and $\gamma < \alpha \to A \cap L_\gamma \in L_\alpha$ (such $\langle L_\alpha, A\rangle$ are said to
be amenable). What we do in the fine structure theory is to show that, for
instance, a $\Sigma_n$ predicate on some limit $L_\alpha$ is "equivalent" (in some uniform
manner) to a $\Sigma_1$ predicate on some amenable $\langle L_\rho, A\rangle$. We then work with
this structure.

The notion of amenability comes in as follows. If g is a 2» sentence of 
£,,(A) (the language £,, with the additional unary predicate A to 
denote A), then 


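$$\models_{\langle L_\alpha, A\rangle} \varphi \quad\text{iff}\quad \models_{\langle \mathrm{TC}(\mathrm{ran}(\varphi)),\; A \cap \mathrm{TC}(\mathrm{ran}(\varphi))\rangle} \varphi.$$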


And, if (L., A) is amenable, then for all ¢, A M TC(ran(¢)) € L.. Hence, 
by a straightforward generalisation of the arguments used to prove 8.4 and 
8.5, we have: 


12.1. LEMMA. $\models^{\Sigma_0}_{\langle L_\alpha, A\rangle}$ is uniformly $\Sigma_1^{\langle L_\alpha, A\rangle}$ for amenable $\langle L_\alpha, A\rangle$.


The notion of a $\Sigma_1$ Skolem function for amenable structures $\langle L_\alpha, A\rangle$ is
defined much as in Section 8, and analogues of 8.1-8.3 clearly hold in this
context. Moreover, by an argument as in 8.6, we have:


12.2. LEMMA. Let $\langle L_\alpha, A\rangle$ be amenable. Then $\langle L_\alpha, A\rangle$ has a uniformly
$\Sigma_1^{\langle L_\alpha, A\rangle}$ $\Sigma_1$ Skolem function.


We denote the canonical 2, Skolem function for (L., A) by h..4, and 
let? H, be the canonical 2» formula of £(A) such that y= 
ha ai, x) dz EL, [Fa a, Wi(Z ¥, X)) for any amenable (L., A). As in 
Section 8, (H; | i<wq) is recursive, and H,,4 is uniformly >{-*? (for 
amenable (L,, A) where H,,4(z, y,i,x)@[Fa.a A ¥, ¥)]. 

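Let $\alpha > \omega$, $n > 0$. The $\Sigma_n$-projectum of $\alpha$, $\rho^n_\alpha$, is the smallest $\rho \le \alpha$ such
that there is a $\Sigma_n(L_\alpha)$ map $f$ with $f''L_\rho = L_\alpha$.

It can be shown that $\rho^n_\alpha$ is equal to the largest $\rho \le \alpha$ such that $\langle L_\rho, A\rangle$ is
amenable for all $A \in \Sigma_n(L_\alpha) \cap \mathcal{P}(L_\rho)$, and also equal to the smallest $\rho$
such that $\mathcal{P}(\rho) \cap \Sigma_n(L_\alpha) \not\subseteq L_\alpha$.

It is easily seen that $m < n \to \rho^n_\alpha \le \rho^m_\alpha$. Hence we set $\rho^0_\alpha = \alpha$ for all $\alpha$.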

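For each $\alpha > \omega$, each $n \ge 0$, we can associate with $\alpha$ a standard code $A^n_\alpha$
and a standard parameter $p^n_\alpha$, with the following properties:

(i) $A^n_\alpha \subseteq L_{\rho^n_\alpha}$ and $A^n_\alpha \in \Sigma_n(L_\alpha)$.

(ii) $A^0_\alpha = p^0_\alpha = \emptyset$.

(iii) For all $m > 0$, $\Sigma_m(\langle L_{\rho^n_\alpha}, A^n_\alpha\rangle) = \mathcal{P}(L_{\rho^n_\alpha}) \cap \Sigma_{n+m}(L_\alpha)$.

(iv) Suppose $\alpha > \omega$, $m \ge 0$, $n \ge 1$, $\langle L_{\bar\rho}, \bar{A}\rangle$ is amenable, and


$$\pi : \langle L_{\bar\rho}, \bar{A}\rangle \prec_{\Sigma_m} \langle L_{\rho^n_\alpha}, A^n_\alpha\rangle.$$


Then there is a unique $\bar\alpha \ge \bar\rho$ such that $\bar\rho = \rho^n_{\bar\alpha}$, $\bar{A} = A^n_{\bar\alpha}$. Moreover, there is
a unique $\tilde\pi \supseteq \pi$, $\tilde\pi : L_{\bar\alpha} \prec_{\Sigma_{n+m}} L_\alpha$, such that for all $i \le n$:

(a) $\tilde\pi(p^i_{\bar\alpha}) = p^i_\alpha$;

(b) $(\tilde\pi \restriction L_{\rho^i_{\bar\alpha}}) : \langle L_{\rho^i_{\bar\alpha}}, A^i_{\bar\alpha}\rangle \prec_{\Sigma_{n+m-i}} \langle L_{\rho^i_\alpha}, A^i_\alpha\rangle$.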

Notice that (for n > 0) by the definition of p2, any =, (L.) predicate on 
L, is reducible to a 2, (L.) predicate on L,». Hence, by conditon (iii) 
above, any 2,(L.) predicate on L. can be coded as a 2;((Los, A2)) 
predicate on L,:. And by condition (iv), we can carry out condensation 
arguments on structures of this latter form, using our uniform 2, Skolem 
functions h,,, defined above. 


* Of course, we have already used the same symbols $H_i$ in the case of the Skolem functions
$h_\alpha$. However, there should be no cause for confusion in our notation. Indeed, $h_\alpha$ is the same as
$h_{\alpha,\emptyset}$, so our new usage can be regarded as an extension of the previous one, with the predicates
$H_\alpha$ being a special case of the predicates $H_{\alpha,A}$.


13. The combinatorial principle $\square_\kappa$


The aim of this section is to prove 11.1. First of all, we remove the 
complication of having to consider the stationary set E involved in 11.1. 

For the whole of this section, $\kappa$ will denote a fixed infinite cardinal. Since
$\square_\kappa$ is trivially true (in ZFC) for $\kappa = \omega$, we shall also assume that $\kappa$ is
uncountable. (This is, in any case, the result required for applications.)


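13.1. LEMMA. Assume $\square_\kappa$. Then there is a stationary set $E \subseteq \kappa^+$ such that
$\alpha \in E \to \mathrm{cf}(\alpha) = \omega$ and $\square_\kappa(E)$ holds.


PROOF. Let $\langle A_\lambda \mid \lambda < \kappa^+ \ \&\ \lim(\lambda)\rangle$ be a $\square_\kappa$-sequence. Let $B_\lambda$ be the set of
limit points of $A_\lambda$ for each $\lambda$. The $B$-sequence has the following properties:
(i) $B_\lambda$ is a closed subset of $\lambda$;

(ii) $\mathrm{cf}(\lambda) > \omega \to B_\lambda$ is unbounded in $\lambda$;

(iii) $\gamma \in B_\lambda \to B_\gamma = \gamma \cap B_\lambda$;

(iv) $\mathrm{cf}(\lambda) < \kappa \to |B_\lambda| < \kappa$.

Since $\mathrm{cf}(\lambda) = \omega \to \mathrm{otp}(B_\lambda) < \kappa$, we can define a partition $W = \bigcup_{\nu < \kappa} W_\nu$
of the set $W = \{\alpha \in \kappa^+ \mid \mathrm{cf}(\alpha) = \omega\}$ by setting $W_\nu = \{\lambda \in W \mid \mathrm{otp}(B_\lambda) = \nu\}$.
Since $W$ is stationary in $\kappa^+$, we can pick $\nu_0 < \kappa$ such that $W_{\nu_0}$ is stationary.
Set $E = W_{\nu_0}$. We show that $\square_\kappa(E)$ holds.

Define $\langle D_\lambda \mid \lambda < \kappa^+ \ \&\ \lim(\lambda)\rangle$ by:

$$D_\lambda =
\begin{cases}
B_\lambda & \text{if } \mathrm{otp}(B_\lambda) \le \nu_0, \\
B_\lambda - \{\alpha \in B_\lambda \mid \mathrm{otp}(B_\alpha) \le \nu_0\} & \text{otherwise.}
\end{cases}$$

It is easily seen that the $D$-sequence has properties (i)-(iv) above. And
clearly, $D_\lambda \cap E = \emptyset$ for each $\lambda$.
Define $\langle C_\lambda \mid \lambda < \kappa^+ \ \&\ \lim(\lambda)\rangle$ by recursion, thus:

$$C_\lambda =
\begin{cases}
\bigcup\{C_\gamma \mid \gamma \in D_\lambda\} & \text{if } \sup(D_\lambda) = \lambda, \\
\bigcup\{C_\gamma \mid \gamma \in D_\lambda\} \cup \{\alpha_n \mid n < \omega\} & \text{otherwise,}
\end{cases}$$

where $\langle \alpha_n \mid n < \omega\rangle$ is any $\omega$-sequence cofinal in $\lambda$ with $\alpha_0 = \sup(D_\lambda)$.

It is easily seen that the $C$-sequence is a $\square_\kappa$-sequence and that $D_\lambda$ is the
set of limit points of $C_\lambda$ for each $\lambda$. Hence the $C$-sequence is a
$\square_\kappa(E)$-sequence. □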


Notice that in the course of the above proof we effectively established
the following result:

13.2. LEMMA. $\square_\kappa$ is equivalent to the existence of a sequence $\langle B_\lambda \mid \lambda < \kappa^+ \ \&\ \lim(\lambda)\rangle$
satisfying conditions (i)-(iv) of the above proof.


13.2 gives the form of $\square_\kappa$ which is perhaps more often seen in the
literature.
Set


$$S = \{\alpha \mid \kappa < \alpha < \kappa^+ \ \&\ \lim(\alpha) \ \&\ \models_{L_\alpha} \text{``$\kappa$ is the largest cardinal''}\}.$$


It suffices to prove the following result: there is a sequence $\langle C_\alpha \mid \alpha \in S\rangle$
such that:
(i) $C_\alpha$ is a closed unbounded subset of $\alpha$;

(ii) $\mathrm{otp}(C_\alpha) \le \kappa$;

(iii) if $\gamma$ is a limit point of $C_\alpha$, then $\gamma \in S$ and $C_\gamma = \gamma \cap C_\alpha$.

To obtain $\square_\kappa$ from this result, we argue as follows. Let $C'_\alpha$ be the set of limit
points of $C_\alpha$. Identifying the closed unbounded set $S$ with $\kappa^+$, we have:
(i) $C'_\alpha$ is a closed subset of $\alpha$;

(ii) if $\mathrm{cf}(\alpha) > \omega$, then $C'_\alpha$ is unbounded in $\alpha$;

(iii) $\mathrm{otp}(C'_\alpha) \le \kappa$;

(iv) if $\gamma \in C'_\alpha$, then $C'_\gamma = \gamma \cap C'_\alpha$.

If $\kappa$ is regular, then (iii) gives: $\mathrm{cf}(\alpha) < \kappa \to |C'_\alpha| < \kappa$, so we are done by
13.2. Otherwise, set $K = \mathrm{cf}(\kappa) < \kappa$, let $\langle \theta_\nu \mid \nu < K\rangle$ be a normal sequence of
limit ordinals cofinal in $\kappa$, and proceed as follows.

Set 0, =« for convenience. Assume further that 4.=0, 06,=o. If 
6, <otp(C2) <= 64, set CL={y © C2] otp(y N C2)=4,}. If no such v 
exists, then otp(C2) = 9, for some limit ordinal » = x, and we can set 
CL={y © C0| 37 < v [otp(y N C2) = 6,]}. Then (Ci|a<x* & lim(a)) 
satisfies the requirements of 13.2, so again we are done. 

In order to establish the existence of the sequence $\langle C_\alpha \mid \alpha \in S\rangle$ as above,
we require the fine structure theory of the previous section. We assume
$V = L$ from now on.**


* Proofs using the fine-structure tend to be rather long and tedious, and the present case is 
no exception. We shall therefore omit most of the individual verifications, and try instead to 
cover the main line of the argument. This will still be fairly heavy going. We have included this 
sketch in this otherwise fairly easy-going account of L because we felt that, as fine-structure 
arguments play an extremely dominant role in present day constructibility theory, some idea 
of what is involved should be provided. Readers who do not wish to know the gory details can 
skip the rest of this section without affecting the reading of later parts of the article. 


° Actually, Silver has shown that the full power of the fine-structure theory is not required
in order to prove $\square_\kappa$. He observes that all we require is a functional hierarchy which resembles
(to some extent) the hierarchy of the Skolem functions $h_{\alpha,A}$ used in the proof of $\square_\kappa$ presented
here. The point being that in order to obtain such a hierarchy with sufficient properties to

Let $\alpha, \beta$ be ordinals, $\alpha \le \beta$. We say $\alpha$ is regular at $\beta$ iff there is no
$L_\beta$-definable map of a bounded subset of $\alpha$ cofinally into $\alpha$. For $n$ a
positive integer, we say $\alpha$ is $\Sigma_n$-regular at $\beta$ iff there is no $\Sigma_n(L_\beta)$ map of a
bounded subset of $\alpha$ cofinally into $\alpha$. For $\alpha \in S$, we define:

$\beta(\alpha)$ = the least $\beta$ such that $\alpha$ is not regular at $\beta$.

$n(\alpha)$ = the least $n$ such that $\alpha$ is not $\Sigma_n$-regular at $\beta(\alpha)$.

$\rho(\alpha) = \rho^{n(\alpha)-1}_{\beta(\alpha)}$; $A(\alpha) = A^{n(\alpha)-1}_{\beta(\alpha)}$.

Notice that, as $\alpha$ is $\Sigma_{n(\alpha)-1}$-regular at $\beta(\alpha)$ but not $\Sigma_{n(\alpha)}$-regular at $\beta(\alpha)$,
we have $\rho^{n(\alpha)}_{\beta(\alpha)} \le \alpha \le \rho(\alpha)$ for all $\alpha \in S$. In fact, it is not too hard to show
that $\rho^{n(\alpha)}_{\beta(\alpha)} = \kappa$ for all $\alpha \in S$, a fact we shall use during the proof.

We now partition $S$ into two classes by setting: $P = \{\alpha \in S \mid n(\alpha) = 1 \ \&\ \mathrm{succ}(\beta(\alpha))\}$,
$R = S - P$. The definition of $C_\alpha$ will depend upon whether
$\alpha \in P$ or $\alpha \in R$. In fact, if $\alpha \in P$, the definition of $C_\alpha$ is extremely easy.
For if $\alpha \in P$, then $\mathrm{cf}(\alpha) = \omega$, and we can take for $C_\alpha$ any $\omega$-sequence
cofinal in $\alpha$. (Since $C_\alpha$ will have no limit points in this case, there is nothing
more to worry about.) That $\alpha \in P \to \mathrm{cf}(\alpha) = \omega$ is easily seen. There is a
$\Sigma_1(L_\beta)$ map of a bounded subset of $\alpha$ cofinally into $\alpha$. Taking some
canonical $\Sigma_1$ representation of this mapping, the existential quantifier
ranges over the definable subsets of $L_\gamma$, where $\beta = \gamma + 1$. For each fixed
formula of $\mathcal{L}$, we can obtain a "part" of this function by regarding the fixed
formula as providing the solution to the existential quantifier, letting the
parameters concerned run over $L_\gamma$. But this "part" of the function is clearly
$L_\gamma$-definable, so as $\gamma < \beta$, it cannot be cofinal in $\alpha$. Since the union of the
countably many "part" functions obtained as above is a function which is
cofinal in $\alpha$, we conclude that $\mathrm{cf}(\alpha) = \omega$.

Suppose now that $\alpha \in R$. Set $\beta = \beta(\alpha)$, $n = n(\alpha)$, $\rho = \rho(\alpha)$, $A = A(\alpha)$.
Since either $n > 1$ or else $\lim(\beta)$, we always have $\lim(\rho)$. Set $h = h_{\rho,A}$. Since


obtain $\square_\kappa$, only "elementary considerations" are necessary. (Silver calls his hierarchy a
"machine".) Silver's method not only proves $\square_\kappa$ but also all of the best known consequences
of the fine-structure theory. And though still fairly long, it avoids a great deal of the proof by
cases involved in fine-structure derivations. Moreover, as we mentioned earlier, it can be
developed using only standard facts about $L$, such as were considered in the first few sections
of this paper. For these reasons, the Silver approach is undoubtedly superior to the standard
fine-structure proofs, when all that one requires is the simplest and quickest rigorous proof of,
say, $\square_\kappa$, assuming only standard, well-known facts about $L$. The main drawback is that the
"$L$-machine" has a somewhat ad hoc structure, without the simple intuitive motivation of the
fine-structure theory. (Also, the Silver method is not well suited to the more complex
fine-structure results, such as the construction of high gap morasses.) For this reason we have
chosen to outline the fine-structure proof here. A similar account of Silver's method would,
we feel, be much less clear. Nevertheless, for the reader who would like to see a relatively
quick, rigorous proof of $\square_\kappa$, we recommend the Silver approach. At the time of writing this
article, Silver's proof is not available in the literature, but an account of it will certainly be
included in the forthcoming revised edition of DEVLIN [1973].

$\rho^n_\beta = \kappa$, we may let $p = p(\alpha)$ = the $<_L$-least $p \in L_\rho$ such that
$L_\rho = h''(\omega \times (\kappa \times \{p\}))$, and such that $\alpha = (p)_0$ in case $\alpha < \rho$, as well.
For + < p, define h, by . 


y=h,(x)oy, x EL, & Jz EL, [Fa, a Hi (z, y, x)). 


Clearly, h. =h,,arc, for all amenable (L,, A NL.). 

Define a map g from a subset of « onto @ by: g(wyv + i) = h(i,(», p)), if 
this value is defined and lies in a, with g otherwise undefined. Clearly, g 
has a uniformly 2,-definition of the form: 


T= g(v)e(3z EL,) G(z,7, v), 


where G is >$%?({p}) (uniformly). For X CL,, set ax =sup(a  X). 
Define functions k:0—>k, 1:80—-a, m:60— , for some 6k, by a 
simultaneous recursion, as follows. 
k(v) = the least + € dom(g) such that g(r) >/(v) and |I(v)|'* < k. 
I(v) = ax,, where X, = hig)(w X (« X {p})). 
m(0) = max(x, 7 +1), where 7 is the least ordinal such that p © L,+1. 
m(A)=sup,<,m(v), if <p (otherwise undefined), for lim(A). 
m(v + 1)= the least y > m(v) such that gok(v)< y andA OL, EL, 
and 


I(v), m(v) € h4(w X (« X {p})), dz EL, G(z,gek(v), k(v)). 


It is easily seen that the above recursion only breaks down when, for some 
limit ordinal 6=«, sup,<em(v)=p. It follows that (I(r) | v<@) is a 
normal sequence, cofinal in a. 

Set C, = {l(v)| v < 6}. Notice that otp(C,) = @ = x. We must show that, 
if a is a limit point of C,, then @ © R and Cz; =aNC,. (We must show 
that @ € R, rather than just & € S, because of the somewhat arbitrary way 
we defined C, for A € P.) 

Let a@<a be a limit point of C,. Pick A<6 with a =I(A). Thus 
I(A) = sup,<,l(v). By means of the definition of k, it is not hard to show 
that a@ € S. (We have I(v)< g°k(v)<I(v + 1) for all v.) The rest of the 
proof is essentially a condensation argument. 

To commence, we notice that @ C X,. This is because « C X, and 
v<A—>|ax, |" <x. So, letting m:(Ls A) =(X%, AM X,), we have 


m3 (Ls, A) <s, (Lnay A MLmay)s afta=idfa. 


Since we also have (trivially) 7 :(L;, A)<z,(L,, A), there are unique B, 7 
such that 6 =p3"', A=A3”', #Da, and #:Lg<s,L, (at least). Set 
p=7'(p). Since i (a@)= a. Hence & <p—>a& = (p)o. Let
h =h,, a. Clearly, h = 2'ohaay? 7. 

It is now just a routine (but by no means nee matter to show (using the 
fact that 7: Lg <x, Lg, etc.) that pg = x, B = B(a@), n = n(a@), p = p(a@). In 
particular, @ € R. It follows that if g, k, i, m are defined from @ as g, k, 1m 
were defined from a, then for all v <A, #(&(v)) = g(v), 7(k(v)) = k(v), 
a(I(v)) = l(v), 7(m(v)) = m(v). In particular, since 7] a@ = id} @ Cz = 
aMC,, as required. 

It is precisely in order to establish the various facts involved here that the 
definition of the function m, in particular, was as complicated as it was. The 
key, initial step is to define a map g’ over (Lag), A A Lmay) using the 
predicate G. Since 7~'""g’ = g’ and k"\ C dom(g’), g’ is thus a D{# 4’ ({p}) 
map of a subset of « cofinally into @. The required equalities now follow by 
virtue of the canonical way the various parameters were defined. (The 
function g’ turns out to be g.) We omit any further details, since it should 
be clear by now that the whole thing is really just a condensation argument. 
(In this, and other respects, the proof is typical of fine structure proofs.) 


14. Weakly compact cardinals in L 


See Section 7 and 6.10 in Chapter B.3 for background on weakly 
compact cardinals. Using the fine structure theory, it is possible to obtain 
useful characterisations of the weakly compact cardinals in L. Firstly, by a 
modification of the proof of $\square_\kappa$ outlined above, we have:


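14.1. THEOREM (Jensen). Assume $V = L$. Let $\kappa$ be an uncountable regular
cardinal, not weakly compact. Then there is a stationary set $E \subseteq \kappa$ and a
sequence $\langle C_\lambda \mid \lambda < \kappa \ \&\ \lim(\lambda)\rangle$ such that:

(i) $C_\lambda$ is closed and unbounded in $\lambda$;

(ii) if $\gamma < \lambda$ is a limit point of $C_\lambda$, then $\gamma \notin E$ and $C_\gamma = \gamma \cap C_\lambda$.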


The assumption that $\kappa$ be not weakly compact in the above is clearly
necessary, since a simple argument (in ZFC) shows that if $\kappa$ is weakly
compact, and if $E \subseteq \kappa$ is stationary in $\kappa$, then $E \cap \lambda$ is stationary in $\lambda$ for
some regular cardinal $\lambda < \kappa$ (see Theorem 7.3 of Chapter B.3). (Hence no
such sequence $\langle C_\lambda \mid \lambda < \kappa \ \&\ \lim(\lambda)\rangle$ as above can exist.) These remarks
provide us with our first characterisation.


14.2. THEOREM (Jensen). Assume $V = L$. An uncountable regular cardinal
$\kappa$ is weakly compact iff: whenever $E \subseteq \kappa$ is stationary in $\kappa$, there is a limit
ordinal $\lambda < \kappa$ such that $E \cap \lambda$ is stationary in $\lambda$.


By a forcing argument, Kunen has shown that the assumption of V = L 
in the above is necessary. (By which we mean that ZFC + GCH is not 
enough.) 

Using 14.1, a virtual repetition of the proof of 11.3 yields: 


14.3. THEOREM (Jensen). Assume V = L. If κ is an uncountable regular 
cardinal which is not weakly compact, then there is a κ-Souslin tree. Hence 
(since the weak compactness of κ implies that there are no κ-Aronszajn 
trees) an uncountable regular cardinal κ is weakly compact iff there are no 
κ-Souslin trees. 


Using 14.3, it is possible to obtain various other equivalences of weak 
compactness in L. We shall give three such, all in the field of infinitary 
combinatorics. 

As usual, if X is a set, [X]^n denotes the collection of all n-element 
subsets of X. We write κ → (λ)^n_μ iff whenever f : [κ]^n → μ, there is a set 
X ⊆ κ, |X| = λ, such that |f″[X]^n| = 1. (We say X is homogeneous for the 
partition f.) A well-known theorem of ZFC (6.10 in Chapter B.3) is: 


14.4. THEOREM. An uncountable cardinal κ is weakly compact iff κ → (κ)^2_2 
iff ∀n < ω ∀μ < κ [κ → (κ)^n_μ]. 


Write κ → [λ]^n_{μ,θ} iff whenever f : [κ]^n → μ, there is X ⊆ κ, |X| = λ, such 
that |f″[X]^n| ≤ θ. Clearly, if κ is weakly compact, then κ → [κ]^n_{μ,θ} for all 
n < ω and all 0 < θ, μ < κ. 


14.5. THEOREM (Martin; Shore). Assume V = L. A regular uncountable 
cardinal κ is weakly compact iff κ → [κ]^n_{μ,θ} holds for some/all n ≥ 2, 
0 < θ < μ < κ. 


PROOF. Let T be a κ-Souslin tree. Let 0 < θ < μ < κ be given. It suffices 
to show that the property κ → [κ]^2_{μ,θ} fails. We may assume that various 
levels of T have been discarded, so that each member of T has at least μ 
immediate successors. Let S(x) denote the set of immediate successors of 
x, each x ∈ T. Let f^x : [S(x)]^2 → μ − {0}, each x ∈ T, where none of the 
resulting partition classes are empty. Define f : [T]^2 → μ as follows: If 
y_0, y_1 ∈ T are comparable, set f({y_0, y_1}) = 0. Otherwise, let x be the 
greatest common predecessor of y_0, y_1. Pick x_i ∈ S(x) with x_i ≤_T y_i. Set 
f({y_0, y_1}) = f^x({x_0, x_1}). 

Suppose that for some X ⊆ T, |X| = κ, A = f″[X]^2 has cardinality at 
most θ. Since T has no antichain of size κ, we know that 0 ∈ A. Let Y be 
the set of all elements of T which lie in X or else below some element of X. 
Then f″[Y]^2 = A, of course. Hence, for each y ∈ Y, there is an immediate 
successor of y which is not in Y. The set of all such is clearly an antichain of 
T, which is absurd, since |Y| = κ. Hence f testifies the failure of 
κ → [κ]^2_{μ,θ}. □ 


A mapping f : [κ]^n → κ is said to be a set-mapping if f(σ) ∉ σ for all 
σ ∈ [κ]^n. A set X ⊆ κ is said to be free for the set-mapping f : [κ]^n → κ if 
f″[X]^n ∩ X = ∅. We write (κ, n) → λ iff every set-mapping f : [κ]^n → κ has 
a free set of size λ. An old result of Erdős-Hajnal is: if κ is weakly 
compact, then (κ, n) → κ for all n ∈ ω. The proof is quite easy. Let 
f : [κ]^n → κ be a set-mapping. Define a partition g : [κ]^(n+1) → n + 2 by: for 
x_1 < ... < x_(n+1) < κ, 


g({x_1, ..., x_(n+1)}) = i, where i is least with 
                             f({x_1, ..., x_(i−1), x_(i+1), ..., x_(n+1)}) = x_i, 
                             if such an i exists; 
                         0 otherwise. 


Let X ⊆ κ, |X| = κ, be homogeneous for g. Clearly, g″[X]^(n+1) = {0} (since f 
is a function). Hence X is free for f. 
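
The step "since f is a function" can be checked by brute force in a finite 
toy case. The following is only a sketch for n = 1 on a finite set of points 
(the list f, the helper names and the sample mapping are illustrative 
assumptions, not part of the original argument): every g-homogeneous set 
with at least three elements must receive colour 0, and is therefore free. 

```python
from itertools import combinations

def g(f, pair):
    """Colour of the pair {x1 < x2} for n = 1, following the definition above."""
    x1, x2 = sorted(pair)
    if f[x2] == x1:      # i = 1: f applied to the pair minus x1 returns x1
        return 1
    if f[x1] == x2:      # i = 2: f applied to the pair minus x2 returns x2
        return 2
    return 0

def is_homogeneous(f, X):
    """All pairs from X receive the same colour under g."""
    return len({g(f, p) for p in combinations(sorted(X), 2)}) <= 1

def is_free(f, X):
    """No value f(x), x in X, lands back in X."""
    return all(f[x] not in X for x in X)

def check(f):
    """Every g-homogeneous set with at least 3 elements is free for f."""
    points = range(len(f))
    return all(
        is_free(f, set(X))
        for size in range(3, len(f) + 1)
        for X in combinations(points, size)
        if is_homogeneous(f, X)
    )

# Hypothetical set-mapping on {0, ..., 6}: f(x) != x for every x.
f = [(x + 2) % 7 for x in range(7)]
print(check(f))
```

Sets of size two are excluded because a single pair is trivially 
homogeneous whatever its colour. 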


14.6. THEOREM (Devlin). Assume V = L. A regular uncountable cardinal κ 
is weakly compact iff (κ, 2) → κ iff ∀n ∈ ω [(κ, n) → κ]. 


PROOF. Let T be a κ-Souslin tree. Define f : [T]^2 → T by 


f({x, y}) = the largest common predecessor of x, y, if x and y are not 
                comparable in T; 
            an arbitrary element of T − {x, y} otherwise. 


Suppose X ⊆ T is free for the set-mapping f on T. Then, for each x ∈ X, 
there is clearly an immediate successor x′ of x such that 
∀y ∈ T [x′ ≤_T y → y ∉ X]. Thus {x′ | x ∈ X} is an antichain of T. Thus |X| < κ. 
Hence f testifies the failure of (κ, 2) → κ. The theorem follows 
immediately. □ 


The above two examples were really results about Souslin trees, of 
course, rather than V = L. This is not the case with our next result. 
A graph is a structure 𝔊 = (G, E), where G is a non-empty set and E is 
a collection of pairs from G, called the edges of 𝔊. (If {x, y} ∈ E, we say x 
and y are joined in 𝔊.) A subgraph of 𝔊 is a substructure of 𝔊 in the 
normal sense. If H ⊆ G, 𝔊↾H denotes the subgraph of 𝔊 with domain H. 
𝔊↾H is small if |H| < |G|. 

Let 𝔊 = (G, E) be a graph, μ a cardinal. A mapping h : G → μ is a 
μ-colouring if h(x) = h(y) → {x, y} ∉ E. The least μ such that 𝔊 has a 
μ-colouring is called the chromatic number of 𝔊, ch(𝔊). Thus ch(𝔊) is the 
smallest number of colours necessary to colour 𝔊, so that no two joined 
vertices have the same colour. 

By P(κ), let us mean the assertion that if 𝔊 is a graph of cardinality κ, all 
of whose small subgraphs have countable chromatic number, then ch(𝔊) is 
countable. It is easily seen that P(κ) holds whenever κ is weakly compact. 


14.7. THEOREM (Shelah). Assume V = L. Let κ > ω_1 be a regular cardinal. 
Then P(κ) holds iff κ is weakly compact. 


PROOF. Let κ > ω_1 be regular but not weakly compact. By 14.1, there is a 
stationary set S ⊆ κ, α ∈ S → cf(α) = ω, such that S ∩ λ is not stationary in 
λ for any limit ordinal λ < κ. (Although we did not state it explicitly in 
14.1, the "indicated" proof yields a stationary set of limit ordinals cofinal 
with ω.) We may assume that β < α → β + ω < α for all α ∈ S. And by ◇_κ, 
there is a partition ⟨B^α_n | n < ω⟩ of α, each α < κ, such that whenever 
⟨B_n | n < ω⟩ is a partition of κ, the set 


{α ∈ κ | cf(α) = ω & ∀n ∈ ω (B_n ∩ α = B^α_n)} 


is stationary in κ. 

For α ∈ S, let A_α be a cofinal ω-sequence in α, chosen so that ∀n [B^α_n 
unbounded in α → A_α ∩ B^α_n ≠ ∅]. Set 𝔊 = (G, E), where G = κ and 
{ν, α} ∈ E ↔ α ∈ S & ν ∈ A_α, for all ν < α < κ. 

We first show that ch(𝔊) > ω. Let f : κ → ω. Set B_n = f⁻¹{n} for each n. 
Let α_0 exceed sup(B_n) for all those n with sup(B_n) < κ. Let 


C = {α ∈ κ | α > α_0 & ∀n [B_n unbounded in κ → B_n ∩ α 
unbounded in α]}. 


C is clearly closed and unbounded in κ, so we can pick α ∈ C, cf(α) = ω, so 
that ∀n ∈ ω (B_n ∩ α = B^α_n). Since α ∈ C, α > α_0, so α ∈ B_n for some n 
with sup(B_n) = κ. Thus B^α_n is unbounded in α. Thus A_α ∩ B^α_n ≠ ∅. Pick 
ν ∈ A_α ∩ B^α_n. Then f(ν) = n and {ν, α} ∈ E. Hence f is not a colouring 
of 𝔊. 
We show next that ch(𝔊↾λ) ≤ ω for all λ < κ. It suffices to show that 
there is an enumeration ⟨x_ν | ν < θ⟩ of λ such that {η < ν | {x_η, x_ν} ∈ E} is 
finite for all ν < θ. (We can then colour 𝔊↾λ by a simple induction along 
this ordering, using only countably many colours.) We prove this by 
induction on λ. For λ = 0 there is nothing to prove. For successor λ the 
induction step is trivial. Suppose then that lim(λ). Let γ = cf(λ). Let C ⊆ λ 
be closed and unbounded in λ, otp(C) = γ, C ∩ S = ∅. Let ⟨α_ν | ν < γ⟩ be 
the canonical enumeration of C. For each ν < γ, let <_ν be a suitable 
ordering (i.e. enumeration) of 𝔊↾(α_(ν+1) − α_ν). Define an ordering (i.e. 
enumeration) <_λ of 𝔊↾λ by: x <_λ y ↔ ∃ν < γ [x <_ν y or x < α_ν ≤ y]. This 
ordering of 𝔊↾λ is as required. For suppose y < λ. Pick ν with α_ν ≤ y < 
α_(ν+1). Suppose x <_λ y and {x, y} ∈ E. If x < α_ν, then we must have y ∈ S 
and x ∈ A_y, so y > α_ν and x lies in the finite set A_y ∩ α_ν. On the other 
hand, if α_ν ≤ x, then x <_ν y, so x lies in the finite set {x <_ν y | {x, y} ∈ E}. 
The proof is complete. □ 
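
The "simple induction along this ordering" is the usual greedy colouring: 
each vertex receives the least colour not already used by its (finitely 
many) earlier neighbours. A minimal finite sketch follows; the function 
names and the sample graph are illustrative assumptions, and the 
transfinite bookkeeping of the proof is not modelled. 

```python
def greedy_colouring(order, edges):
    """Colour vertices in the given order with the least colour
    not used by an earlier neighbour."""
    neighbours = {v: set() for v in order}
    for x, y in edges:
        neighbours[x].add(y)
        neighbours[y].add(x)
    colour = {}
    for v in order:
        # Only neighbours that come earlier in the order are coloured yet.
        used = {colour[u] for u in neighbours[v] if u in colour}
        c = 0
        while c in used:
            c += 1
        colour[v] = c
    return colour

def is_proper(colour, edges):
    """No edge joins two vertices of the same colour."""
    return all(colour[x] != colour[y] for x, y in edges)

# Hypothetical example: a 6-cycle, enumerated in its natural order.
order = list(range(6))
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)]
colour = greedy_colouring(order, edges)
print(is_proper(colour, edges))
```

Because every vertex has only finitely many earlier neighbours, the least 
unused colour is always a natural number, which is why countably many 
colours suffice. 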


15. Other results 


In this section we summarise, very briefly, one or two further results on 
constructibility. Firstly, what effect does the assumption V = L have upon 
model theory? Since GCH has a strong influence in this area, one might 
expect that V = L has an even stronger effect. This is indeed the case. Of 
the many results known to date, the most striking concerns the so-called 
"gap" theorems. Here, one takes some fixed, countable, first-order 
language ℒ, containing a distinguished unary predicate U, and denotes by 
(κ, λ) → (κ′, λ′) the following "Löwenheim-Skolem Theorem": Every 
ℒ-structure 𝔄 with |𝔄| = κ and |U^𝔄| = λ is elementarily equivalent to a 
structure 𝔅 with |𝔅| = κ′ and |U^𝔅| = λ′. An old result of Vaught says that 
if β ≥ ω, then (ω_(α+β), ω_α) → (ω_(γ+δ), ω_γ) for all γ, δ, assuming GCH. And by 
simple examples, it is easily shown that if δ > β (β < ω), then 
(ω_(α+β), ω_α) ↛ (ω_(γ+δ), ω_γ). Since (ω_(α+β), ω_α) → (ω_(α+δ), ω_α) is just a special case of 
the usual, downward Löwenheim-Skolem Theorem, if δ < β, it follows 
that the only interesting cases left are those of the form 
(ω_(α+n), ω_α) → (ω_(β+n), ω_β). Assuming GCH, the only positive result known to 
date is Chang's Theorem, that (ω_(α+1), ω_α) → (ω_(β+1), ω_β) for ω_β regular. 
Assuming V = L, however, the problem has a complete solution. 


15.1. THEOREM (Jensen). Assume V = L. Then for all α, β and all n, 


(ω_(α+n), ω_α) → (ω_(β+n), ω_β). 
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The proof of ‘“‘Chang’s Theorem” for the singular cardinal case only 
requires the combinatorial principle □. But all other cases of the above 
require extremely heavy fine-structure machinery (morasses). Some idea of 
this machinery is provided in DEVLIN [1973], where the easiest case is dealt 
with. 

Further results are also possible in the theory of trees in L. A κ-Kurepa 
tree is a κ-tree with at least κ⁺ κ-branches. It is known from work of Silver 
that the existence of such trees is unprovable in ZFC, even with GCH. (See 
Theorem 5.7 of Chapter B.4.) In L, however, they are quite common. 
Specifically, we have: 


15.2. THEOREM (Jensen). Assume V = L. Let κ be an uncountable regular 
cardinal. Then there is a κ-Kurepa tree iff κ is not ineffable. 


(A cardinal κ is ineffable if, whenever f : [κ]^2 → 2, there is a stationary 
set which is homogeneous for f. Thus ineffability is a generalisation of weak 
compactness. It is, however, much stronger than weak compactness.) 

The assumption V = L also has a strong effect upon the descriptive set 
theory of the continuum. This is mainly due to the following result of 
Gödel: 


15.3. THEOREM (Gödel). Assume V = L. Then there is a Δ^1_2 well-ordering 
of P(ω) of type ω_1. 


Details of all of the above results are given in DEVLIN [1973]. 

Some of the more recent developments in constructibility theory have 
concerned not the “‘inside’”’ of L, but the relationship between L and V. 
The limitations of space do not allow us to elaborate, but it seems that 
there is no possibility of adopting a compromising approach to ‘““V = L”’. 
Either V is “‘very much like” L, or else ‘‘very unlike” L. We expect to 
include details of the relevant results (due to Jensen) in the forthcoming 
revised edition of DEVLIN [1973]. 


Historical Note 


The material covered in Sections 1 and 2 is due to Gédel. The results of 
Sections 3-7 were undoubtedly known to Gédel, though some of them 
perhaps only “implicitly”. The first explicit reference to the fact that the 
constructible hierarchy and its canonical well-ordering are 27" occurs in 
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work of Jensen and Karp, who used primitive recursive functions to obtain 
this result. The proofs given here are our own versions, though we make no 
claims as to originality. 

From Section 8 onwards, with only minor exceptions, everything covered 
is due to Jensen. 


References 


DEVLIN, K.J. 
[1973] Aspects of Constructibility, Lecture Notes in Mathematics, Vol. 354 (Springer, 
Berlin). 
DEVLIN, K.J. and H. JOHNSBRÅTEN 
[1974] The Souslin Problem, Lecture Notes in Mathematics, Vol. 405 (Springer, Berlin). 
GÖDEL, K. 
[1938] The Consistency of the Axiom of Choice and of the Generalised Continuum- 
Hypothesis, Annals of Mathematics Studies, Vol. 3 (Princeton Univ. Press, 
Princeton, NJ). 




B.6 


Martin’s Axiom 


MARY ELLEN RUDIN 


HANDBOOK OF MATHEMATICAL LOGIC 
© North-Holland Publishing Company, 1977 Edited by J. Barwise 




Martin’s axiom, known as MA, can be stated in a number of different 
ways. The topological form of MA is easy to remember. The Boolean 
algebra form is sometimes reassuringly familiar. But the partial order form 
is the useful one. It is an unfamiliar hurdle for nonlogicians. But it is more 
or less necessary for efficient use of MA; it is a first step in forcing 
techniques; and it is really not difficult. 

If (P, ≤) is a partially ordered set, then D ⊆ P is dense provided, for 
each p ∈ P, there is a d ∈ D with p ≤ d. A subset Q of P is compatible 
provided, for each finite subset F of Q, there is a q ∈ P such that p ≤ q for 
all p ∈ F. ccc is read countable chain condition, and (P, ≤) is ccc provided 
every pairwise incompatible subset is countable. 


Martin's Axiom: Suppose that (P, ≤) is a ccc partially ordered set and 𝒟 is 
a family of less than 2^ω dense subsets of P. Then there is a compatible subset 
Q of P which meets every member of 𝒟. 


In Section 6 of Chapter B.4 Burgess proves: 


1. THEOREM. If ZFC is consistent, then ZFC + MA + ¬CH is consistent. 


MA is also known to be independent of ¬CH. 

We begin by proving that the above partial order form of MA implies the 
topological form; actually they are equivalent. 

A topological space is ccc if every family of disjoint open sets is 
countable. 


2. THEOREM (MA)*. If X is a ccc compact Hausdorff space, then X is not 
the union of less than 2^ω nowhere dense sets. 


PROOF. Suppose that ω ≤ λ < 2^ω, that X is a ccc compact Hausdorff space, 
and that {X_α}_(α∈λ) is a family of nowhere dense subsets of X. 

Let P be the set of all nonempty open sets in X and partially order P by 
p ≤ q if p ⊇ q. Since X is ccc so is P. For α ∈ λ, define D_α = 
{p ∈ P | cl(p) ∩ X_α = ∅}, where cl(p) is the closure of p; each D_α is dense 
in P. Hence there is a compatible Q ⊆ P which intersects every D_α. Q is a 
basis for a filter and X is compact, so there is an x ∈ ∩{cl(q) | q ∈ Q}. For 
each α ∈ λ there is a q ∈ Q with cl(q) ∩ X_α = ∅, so x ∉ X_α. Hence 
X ≠ ∪_(α∈λ) X_α. □ 


If CH holds, then Theorem 2 is just the Baire category theorem, so the 
following is no surprise: 


* We write “‘Theorem (MA)” to indicate that we use Martin’s Axiom to prove the theorem. 
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3. THEOREM. CH implies MA. 


PROOF. Suppose that {D_n}_(n∈ω) is a family of dense subsets of a ccc partially 
ordered set (P, ≤). By induction choose d_n ∈ D_n such that d_n ≥ d_(n−1) for all 
n. Then {d_n}_(n∈ω) is a compatible subset of P which meets every D_n. □ 
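
The induction in this proof can be sketched on a concrete countable partial 
order. The code below assumes, purely for illustration, that P is the set of 
finite 0-1 tuples with p ≤ q iff q extends p, and that D_n is the set of 
tuples containing at least n ones; the helper names are hypothetical. Each 
"extender" is a witness to density: given p it returns some d ∈ D_n with 
p ≤ d. 

```python
def leq(p, q):
    """p <= q iff the tuple q extends the tuple p."""
    return q[:len(p)] == p

def make_extender(n):
    """Density witness for D_n = {p : p has at least n entries equal to 1}."""
    def extend(p):
        missing = max(0, n - sum(p))
        return p + (1,) * missing
    return extend

def build_chain(start, extenders):
    """Return d_0 <= d_1 <= ... where d_n is produced by the n-th extender."""
    chain, d = [], start
    for extend in extenders:
        d = extend(d)          # d_n lies in D_n and lies above d_(n-1)
        chain.append(d)
    return chain

chain = build_chain((0,), [make_extender(n) for n in range(5)])
print(chain)
```

Any finite subset of the chain has its largest member as an upper bound, 
which is exactly the compatibility required by the definition. 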


A basic fact about partial orders is: 


4. THEOREM (MA + ¬CH). If (P, ≤) is a ccc partial order and R ⊆ P is 
uncountable, then there is an uncountable compatible Q ⊆ R. 


PROOF. Without loss of generality |R| = ω_1. Let 
G = {p ∈ P | |{q ∈ R | p and q are compatible}| ≤ ω}. 


Let G* be a maximal pairwise incompatible subset of G. Let P* = 
{p ∈ P | p is not compatible with any q ∈ G*} and let R* = R ∩ P*. 

Since (P, ≤) is ccc, G* is countable and R − R* is countable. Thus there 
is a one-to-one indexing {q_α}_(α∈ω_1) of R*. 

Let P′ be the set of all finite, compatible in (P, ≤), subsets of P*. 
Observe that if F ∈ P′ and p ≤ q for all p ∈ F, then q ∈ P*. Thus P′, 
partially ordered by reverse inclusion, is ccc. 

For each α ∈ ω_1, define D_α = {F ∈ P′ | F ∩ {q_β}_(β>α) ≠ ∅}. 

For each F ∈ P′ there is a q ∈ P* with p ≤ q for all p ∈ F. Since q is 
compatible with uncountably many q_β, each D_α is dense in P′. So there is a 
compatible Q′ ⊆ P′ which meets every D_α. Thus Q = ∪Q′ is an uncountable 
compatible subset of P. □ 


Recall (see Chapter B.3) that a Souslin tree is an uncountable tree with 
no uncountable chains or antichains. In a tree a compatible set in reverse 
order is a chain; so Theorem 4 yields: 

5. THEOREM (MA + ¬CH). There is no Souslin tree. 

A Souslin tree can be used to build a Souslin line — a connected, linearly 
ordered, ccc space which is not separable. It is well known that the product 
of two Souslin lines is not ccc. However: 


6. THEOREM (MA + ¬CH). The product of any family of ccc spaces is ccc. 


PROOF. By a simple Δ-system argument (not using MA, see Theorem 5.8 of 
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Chapter B.3), if a product is not ccc, then some subproduct of finitely many 
factors is not ccc. So to prove Theorem 6 we need only show that the 
product of any two ccc spaces is ccc. 

Suppose that X and Y are ccc and that {U_α × V_α}_(α∈ω_1) are disjoint, 
nonempty basic open sets in X × Y. Let P be the set of all nonempty open 
sets in X partially ordered by inclusion. By Theorem 4 there is an 
uncountable compatible subset Q of {U_α}_(α∈ω_1). But if U_α and U_β belong to 
Q, U_α ∩ U_β ≠ ∅ so V_α ∩ V_β = ∅. Thus {V_α | U_α ∈ Q} is a family of disjoint 
open sets which contradicts the fact that Y is ccc. □ 


MA was first discovered by MARTIN and SOLOVAY [1970], who observed 
that MA was inherent in a number of previous proofs of the consistency 
that there be no Souslin trees. Their original paper is a beautiful 
explanation of MA which I recommend. They stress the usefulness of MA as a 
viable alternative to CH. They point out that many of the traditional 
problems solved using CH can be solved using MA alone. Frequently the 
part of CH that is used is only MA. But almost equally often, a statement 
true under CH can be proved false under MA + ¬CH. Since MA + ¬CH 
is consistent with ZFC, any such statement is itself independent of ZFC. 

MA is severely limited by the fact that it only says something interesting 
about cardinals λ where ω < λ < 2^ω. However these are precisely the 
cardinals which most often cause grief for nonlogicians. Certainly general 
topologists have found MA applicable to a rich variety of their problems. 
Those discussed by Juhasz in Chapter B.7 give some feeling for its use and 
we will not attempt to give references for the multitude of other topological 
uses of MA. 

An analyst who works with Banach spaces recently asked me two 
questions. Must a compact Hausdorff space of cardinality ≤ 2^ω be 
sequentially compact? Must a ccc compact Hausdorff space with a point countable 
separating family of open F_σ's be metrizable? Neither question can be 
answered in ZFC. The analyst already knew that MA + ¬CH implies that 
the answer to both questions is yes. Analysts too are beginning to use and 
recognize MA. 

An important recent result using MA is in algebra. Shelah has recently 
proved that Whitehead's problem is undecidable in ZFC: if V = L, then 
every W-group is free; but if MA + ¬CH, there is a W-group which is not 
free. There is an excellent description of Whitehead's problem and 
Shelah's solution written for the Monthly by EKLOF [1977] so we will avoid 
the necessary definitions and heartily recommend the reading of Eklof's 
article. 
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Another recommended Monthly article, written by SHOENFIELD [1975], 
gives an elegant, elementary discussion of MA and proves many of the 
same theorems given here. 

Shelah's solution of Whitehead's problem illustrates the relationship 
between MA + ¬CH and V = L often seen in topology. Roughly speaking, 
MA + ¬CH completely unravels the area between ω and 2^ω while 
V = L holds it completely rigid. Anyway these two axioms are strongly 
contrasting and often yield contradictory theorems. 

Let us turn now to some of the combinatorial consequences of MA: 


7. THEOREM (MA). Suppose that 𝒜 and ℬ are families of cardinality less 
than 2^ω of subsets of ω such that, for all finite subsets 𝒞 of 𝒜 and elements B 
of ℬ, B − ∪𝒞 is infinite. Then there is an M ⊆ ω such that B − M is infinite 
for all B ∈ ℬ but A − M is finite for all A ∈ 𝒜. 


PROOF. Index 𝒜 = {A_α}_(α∈λ) and ℬ = {B_α}_(α∈λ) for some λ < 2^ω. Let P = 
{(H, K) | H is a finite subset of λ and K is a finite subset of ω}. Define 
(H, K) ≤ (H′, K′) in P provided H ⊆ H′, K ⊆ K′, and (K′ − K) ∩ 
∪_(α∈H) A_α = ∅. 

Any uncountable subset of P has two members with the same second 
element, say (H, K) and (H′, K). Since (H, K) ≤ ((H ∪ H′), K) and 
(H′, K) ≤ ((H ∪ H′), K), P is ccc. 

For each α ∈ λ, define D_α = {(H, K) ∈ P | α ∈ H}. For α ∈ λ and 
n ∈ ω, define E_(α,n) = {(H, K) ∈ P | |K ∩ B_α| > n}. It is easy to check that 
each member of 𝒟 = {D_α | α ∈ λ} ∪ {E_(α,n) | α ∈ λ and n ∈ ω} is dense in P. 
Since |𝒟| = λ < 2^ω, MA implies that there is a compatible subset Q of P 
which meets every member of 𝒟. 

Define M = ω − ∪{K | (H, K) ∈ Q}. If α ∈ λ there is (H, K) ∈ Q ∩ 
D_α. If (H′, K′) is any other member of Q, since Q is compatible, there is 
(H″, K″) ∈ P such that (H″, K″) ≥ (H, K) and (H″, K″) ≥ (H′, K′). Since 
K′ ⊆ K″ and (K″ − K) ∩ A_α = ∅, (K′ ∩ A_α) ⊆ K. Thus A_α − M ⊆ K so A_α − M 
is finite. 

Since for each n ∈ ω there is an (H, K) ∈ Q with |K ∩ B_α| > n, B_α − M 
is infinite for all α ∈ λ. □ 
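
The ordering on P used in this proof is easy to experiment with. The sketch 
below is illustrative only: the sets A_α are replaced by hypothetical finite 
truncations, and the names leq, A, c1, c2 and join are not from the text. It 
checks the observation behind the ccc argument, namely that two conditions 
with the same second coordinate K have the common extension (H ∪ H′, K). 

```python
def leq(c1, c2, A):
    """(H, K) <= (H2, K2): both parts grow, and the new points of K2
    avoid every A[a] with a in H."""
    (H, K), (H2, K2) = c1, c2
    forbidden = set().union(*(A[a] for a in H))
    return H <= H2 and K <= K2 and not ((K2 - K) & forbidden)

# Hypothetical finite truncations of two infinite subsets of omega.
A = {0: frozenset(range(0, 20, 2)), 1: frozenset(range(0, 20, 3))}

c1 = (frozenset({0}), frozenset({1, 5}))
c2 = (frozenset({1}), frozenset({1, 5}))
join = (c1[0] | c2[0], c1[1])          # (H union H', K)

print(leq(c1, join, A), leq(c2, join, A))

# Adding the even number 4 to K is not allowed above c1, since 4 is in A[0];
# adding the odd number 7 is allowed.
bad = (frozenset({0}), frozenset({1, 5, 4}))
good = (frozenset({0}), frozenset({1, 5, 7}))
print(leq(c1, bad, A), leq(c1, good, A))
```

The first coordinate H records promises ("from now on, stay out of these 
A_α"), while the second coordinate K is the finite part of ω − M built so far. 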


The proof of Theorem 7 is typical of MA proofs. With λ = ω, this 
theorem is a frequently used fact which does not need MA in its proof. But 
MA allows us to extend the theorem to all cardinals less than 2^ω although 
such an extension would be false without some set theoretic assumption 
beyond ZFC. 
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Two direct consequences of Theorem 7 which are sometimes more 
immediately applicable are: 


8. COROLLARY (MA). If ℬ is a family of cardinality less than 2^ω of subsets 
of ω with each finite subset of ℬ having infinite intersection, then there is an 
infinite subset L of ω such that L − B is finite for all B ∈ ℬ. 


PROOF. Define 𝒜* = {(ω − B) | B ∈ ℬ} and ℬ* = {ω}. By Theorem 7 
applied to 𝒜*, ℬ*, there is an M such that A − M is finite for all A ∈ 𝒜* 
but ω − M is infinite. Define L = ω − M. □ 


9. COROLLARY (MA). Suppose that 𝒜 and ℬ are families of cardinality less 
than 2^ω of subsets of ω such that B − ∪𝒞 is infinite for all B ∈ ℬ and finite 
𝒞 ⊆ 𝒜. Then, if |𝒜 ∪ ℬ| = λ, there is an infinite subset L of ω such that 
A ∩ L is finite for all A ∈ 𝒜 and B ∩ L is infinite for all B ∈ ℬ. 


PROOF. By Theorem 7 there is an M ⊆ ω such that A − M is finite for all 
A ∈ 𝒜 and B − M is infinite for all B ∈ ℬ. Define L = ω − M. Then, for 
B ∈ ℬ, B − M = B − (ω − L) = L − (ω − B) = L ∩ B is infinite. Similarly 
L ∩ A is finite for all A ∈ 𝒜. □ 


A consequence of Corollary 8 is: 

10. COROLLARY (MA). If λ < 2^ω, then 2^λ is sequentially compact. 


PROOF. Let {f_n}_(n∈ω) be an infinite subset of 2^λ. We show that, if f is an 
arbitrary limit point of {f_n}_(n∈ω), then there is an L ⊆ ω such that {f_n}_(n∈L) 
converges to f. 

Let G be the set of all functions into 2 which f extends whose domain is 
a finite subset of λ. For g ∈ G let B_g = {n ∈ ω | f_n extends g}. Since 
|G| = λ, the hypotheses of Corollary 8 are satisfied and there is an infinite 
L ⊆ ω such that L − B_g is finite for all g ∈ G. L clearly has the desired 
property. □ 


Actually, in Corollary 10, 2^λ may be replaced by any compact Hausdorff 
space of cardinality less than 2^(2^ω) (see Corollary 1 to Theorem 1.7 in Chapter 
B.7). 

For an application of Corollary 9, define F = {f : ω → ω} with f < g in F 
provided there is an n ∈ ω with f(k) < g(k) for all k > n. A subset G of F 
is dominating provided, for every f ∈ F there is a g ∈ G with g > f. 
{f_α}_(α<λ) ⊆ F is called a scale if it is dominating and α < β < λ implies that 
f_α < f_β. 


11. COROLLARY (MA). Every dominating family has cardinality 2^ω, and 
there is a scale. 


PROOF. Suppose that λ < 2^ω and that {f_α}_(α∈λ) ⊆ F. For α ∈ λ, define A_α = 
{(i, j) ∈ ω² | j ≤ f_α(i)} and for i ∈ ω define B_i = {i} × ω. By Corollary 9 
there is an L ⊆ ω² such that all A_α ∩ L are finite and B_i ∩ L are infinite. 
Choose f ∈ F with (i, f(i)) ∈ L for all i ∈ ω. Then f_α < f for all α ∈ λ. □ 
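
For a countable family no appeal to MA is needed: the classical diagonal 
argument already produces a function that eventually exceeds each member. 
The sketch below shows only that countable case, truncated to a finite list; 
the name dominate and the sample family are illustrative assumptions. The 
content of Corollary 11 is that MA pushes the same conclusion to every 
family of size less than 2^ω. 

```python
def dominate(fs):
    """Return f with f(i) > fs[k](i) whenever k <= i and k < len(fs)."""
    def f(i):
        # At stage i, beat the first i + 1 functions of the family.
        return 1 + max(fk(i) for fk in fs[: i + 1])
    return f

# Hypothetical family f_k(i) = k*i + k for k = 1, ..., 5.
fs = [lambda i, k=k: k * i + k for k in range(1, 6)]
f = dominate(fs)

print([f(i) for i in range(8)])
```

Here fs[k] is dominated from the argument k onwards, which matches the 
definition of f < g above with n = k. 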


A trivial consequence of Corollary 11 is that 2^ω is regular. A 
consequence of Corollary 9 (which was first used to show the consistency of a 
counterexample to the normal Moore space conjecture) is the following. 


12. THEOREM (MA). If X is a subset of the real numbers of cardinality less 
than 2^ω, then every subset of X is a relative G_δ set. 


PROOF. Suppose that Y ⊆ X and that {U_n}_(n∈ω) is a countable open basis for 
the real numbers such that no pair of distinct numbers is in the intersection 
of infinitely many U_n's. 

Index Y = {y_α}_(α∈λ) and X − Y = {x_α}_(α∈λ): we assume without loss of 
generality that neither is empty and repetitions do not matter. For α ∈ λ, 
let B_α = {n ∈ ω | y_α ∈ U_n} and let A_α = {n ∈ ω | x_α ∈ U_n}. Define 𝒜 = 
{A_α}_(α∈λ) and ℬ = {B_α}_(α∈λ). By Corollary 9 there is an L ⊆ ω such that 
L ∩ B_α is infinite and L ∩ A_α is finite for all α ∈ λ. For n ∈ ω define 
L_n = ∪_(m>n, m∈L) U_m. Since each B_α ∩ L is infinite, y_α ∈ ∩_(n∈ω) L_n. But each 
A_α ∩ L is finite so x_α ∉ ∩_(n∈ω) L_n. Thus Y is a relative G_δ. □ 


If X is infinite of cardinality λ, the cardinality of the set of all G_δ sets 
in X is 2^ω, and the cardinality of all subsets of X is 2^λ. Thus Theorem 12 
yields: 


13. COROLLARY (MA). If ω ≤ λ < 2^ω, then 2^λ = 2^ω. 

Another Baire category type theorem is: 

14. THEOREM (MA). Suppose that ω ≤ λ < 2^ω, that X is a space with a 
countable basis, and that {X_α}_(α∈λ) is a family of nowhere dense subsets of X. 
Then ∪_(α∈λ) X_α is the union of countably many nowhere dense sets. 

PROOF. Let {U_n}_(n∈ω) be a basis for the topology of X: make sure that each 
basis element is indexed with infinitely many different n. Then define 
B_n = {m ∈ ω | U_m ⊆ U_n}. For each α ∈ λ define A_α = {n ∈ ω | 
U_n ∩ X_α ≠ ∅}. Let ℬ = {B_n | n ∈ ω} and 𝒜 = {A_α | α ∈ λ}. By Corollary 9 
there is an L ⊆ ω such that L ∩ B_n is infinite for all n ∈ ω but L ∩ A_α is 
finite for all α ∈ λ. Thus if Y_n = X − ∪{U_m | m ∈ L and m > n}, Y_n is nowhere 
dense and, for each α ∈ λ, there is an n with X_α ⊆ Y_n. Therefore ∪_(α∈λ) X_α 
is the union of countably many nowhere dense sets. □ 


Remark. All the consequences 7-14 of MA were in fact proved from 7. 
Actually, van Douwen has shown that the simpler 8 implies 7 (and hence 
7-14). To see this, let 𝒜, ℬ be as in 7, and let 


𝒟 = {[ω − n]^(<ω) | n ∈ ω} ∪ {{s ∈ [ω]^(<ω) | s ∩ B ≠ ∅} | B ∈ ℬ} 
    ∪ {[ω − A]^(<ω) | A ∈ 𝒜} 


(where [I]^(<ω) is the set of all finite subsets of I). Apply 8 to 𝒟 to get an 
L ⊆ [ω]^(<ω) with L − X finite for all X ∈ 𝒟, and let M = ω − ∪L. It is 
known that 4-6 and 15-17 do not follow from 8. 


Baire category theorems, Borel hierarchy, and measure theory type 
theorems go together. One consequence of the following theorem is that 
the union of any family of less than 2^ω measurable sets of real numbers is 
measurable. We use R for the real line and m(X) for the Lebesgue 
measure of X. 


15. THEOREM (MA). If ω ≤ λ < 2^ω and for each α ∈ λ, X_α ⊆ R and 
m(X_α) = 0, then m(∪_(α∈λ) X_α) = 0. 


PROOF. Suppose that ε > 0. We prove Theorem 15 by showing that there is 
an open subset Y of R such that m(Y) ≤ ε and X_α ⊆ Y for all α ∈ λ. 

Define P = {U ⊆ R | U is open and m(U) < ε} and partially order P by 
reverse inclusion. 

Define ℬ to be a countable family of open intervals which form a basis 
for R. Let ℬ* be the set of all finite unions of members of ℬ. 

If S is an uncountable subset of P, there is an uncountable subset S′ of S 
and a 0 < n ∈ ω such that U ∈ S′ implies that m(U) + 1/n < ε. For each 
U ∈ S′ choose U* ⊆ U such that U* ∈ ℬ* and m(U − U*) < 1/n. Since S′ 
is uncountable and ℬ* is countable, there must be different terms U and V 
of S′ with U* = V*. Then U and V are compatible since m(U ∪ V) ≤ 
m(U) + m(V − V*) < ε; thus (P, ⊆) is ccc. 

For each α ∈ λ define D_α = {U ∈ P | X_α ⊆ U}. Since D_α is dense there is 
a compatible Q ⊆ P which meets every D_α. Let Y = ∪Q. For all α ∈ λ, 
X_α ⊆ Y. Since R is hereditarily Lindelöf, countably many members of Q 
cover Y so, if m(Y) > ε, there is some finite subset Q′ of Q with 
m(∪Q′) > ε. However, since Q is compatible, there is a Y′ ∈ P with 
∪Q′ ⊆ Y′. Since m(Y′) < ε we have a contradiction. Thus m(Y) ≤ ε and 
Y has all of the desired properties. □ 
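
For comparison, the countable case of Theorem 15 needs no partial order at 
all: one covers the n-th null set by an open set of measure less than 
ε/2^(n+1) and sums the geometric series. The sketch below does this for a 
finite list of single points, using exact rational arithmetic; the names 
cover and total_length and the sample data are illustrative assumptions. 
For uncountably many sets no such summation is available, which is where 
the ccc partial order P above takes over. 

```python
from fractions import Fraction

def cover(points, eps):
    """Cover the n-th point by an open interval of length eps / 2**(n + 1)."""
    intervals = []
    for n, x in enumerate(points):
        half = Fraction(eps) / 2 ** (n + 2)
        intervals.append((x - half, x + half))
    return intervals

def total_length(intervals):
    """Sum of the interval lengths: an upper bound for the measure of the union."""
    return sum(b - a for a, b in intervals)

# Hypothetical data: the fractions p/q in [0, 1] with q <= 5 (with repetitions).
points = [Fraction(p, q) for q in range(1, 6) for p in range(q + 1)]
eps = Fraction(1, 100)
intervals = cover(points, eps)

print(total_length(intervals) < eps)
```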


An Aronszajn tree is a tree (T, ≤) of cardinality ω_1 having no 
uncountable level or chain. A Souslin tree is an Aronszajn tree in which every 
antichain is countable (see Chapter B.3). MA + ¬CH denies the existence 
of a Souslin tree in an especially strong way: 


16. THEOREM (Baumgartner; MA + ¬CH). Every Aronszajn tree is the 
union of countably many antichains. 


(Such an Aronszajn tree is called special.) 


PROOF. Let (T, ≤) be an Aronszajn tree. Our aim is to define a function 
q : T → Q (where Q is the set of all rational numbers) such that s < t in T 
implies that q(s) < q(t). Since q⁻¹(r) for each r ∈ Q will then be an 
antichain, Theorem 16 will then be proved. 

Let P = {f : S → Q | S is a finite subset of T and s < t in S implies 
that f(s) < f(t)}. Define f ≤ g in P provided g extends f. For each t ∈ T 
define D_t = {f ∈ P | t ∈ domain of f}. Each D_t is dense in P and |T| = ω_1 
so, if P is ccc, there is a compatible P′ ⊆ P which meets every D_t. Thus 
there is a q : T → Q which extends every f ∈ P′, and this q has the desired 
properties. 

It remains to prove that P is ccc. Assume that {f_α}_(α∈ω_1) ⊆ P and that the 
domain of f_α is S_α. We prove that P is ccc by showing that there are α < β 
in ω_1 and f ∈ P such that f extends both f_α and f_β. 

By a Δ-system argument (see Chapter B.3) we can choose an n ∈ ω, a 
k ∈ ω, and an infinite (uncountable) subset M of ω_1 such that: 

(a) For all α ∈ M, S_α has n terms s_(0,α), s_(1,α), ..., s_(n−1,α). 

(b) α ≠ β in M and i ∈ n imply f_α(s_(i,α)) = f_β(s_(i,β)). 

(c) α ≠ β in M and i ∈ k imply s_(i,α) = s_(i,β). 

(d) α < β in M and i and j in n − k imply that the level in T to which s_(i,α) 
belongs precedes the level in T to which s_(j,β) belongs. 

(e) i ∈ n − k implies that {s_(i,α)}_(α∈M) is an antichain. 

To see that you can get (e), recall that if M′ is any uncountable subset of 
ω_1, then {s_(i,α)}_(α∈M′) must contain an uncountable antichain since otherwise 
M′ (with the induced order) would be a Souslin tree which is denied by 
MA + ¬CH. 

Suppose that (i, j) is a pair of numbers in n − k. We now use Ramsey's 
theorem: ω → (ω)^2_2 (see Chapter B.3). Let A = {pairs (α, β) of terms of 
M | α < β and s_(i,α) < s_(j,β)} and let B = {pairs (α, β) of terms of M | α < β and 
s_(i,α) ≮ s_(j,β)}. There can be no α < β < γ all of whose pairs are in A since then 
s_(i,α) < s_(j,γ) and s_(i,β) < s_(j,γ), which by (d) would mean that s_(i,α) < s_(i,β), and 
this would contradict (e). So there is an infinite M′ ⊆ M all of whose pairs 
are in B. 

For different i and j in n − k choose successively smaller infinite M's 
until we have an infinite M* ⊆ M such that for all α < β in M* and i and j 
in n − k, s_(i,α) ≮ s_(j,β). Then, for α < β in M*, f : (S_α ∪ S_β) → Q defined by f(s) 
is f_α(s) for s ∈ S_α, and f_β(s) for s ∈ S_β is well defined and belongs to P and 
extends both f_α and f_β. □ 
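
The opening reduction of this proof, that the fibres of a strictly 
increasing map q are antichains, can be seen in miniature on a finite tree. 
In the sketch below the tree is given by a hypothetical parent table and q 
is simply the depth of a node. This is an assumption made for the finite 
illustration only: an Aronszajn tree has uncountably many levels, so the 
rational-valued q must be assembled from finite approximations by MA. 

```python
def depth(parent, t):
    """Number of proper predecessors of t."""
    d = 0
    while parent[t] is not None:
        t = parent[t]
        d += 1
    return d

def below(parent, s, t):
    """True iff s < t in the tree order (s is a proper predecessor of t)."""
    t = parent[t]
    while t is not None:
        if t == s:
            return True
        t = parent[t]
    return False

def fibres(parent, q):
    """Group the nodes of the tree by their q-value."""
    out = {}
    for t in parent:
        out.setdefault(q(t), set()).add(t)
    return out

def is_antichain(parent, nodes):
    """No two members of nodes are comparable in the tree order."""
    return not any(below(parent, s, t) for s in nodes for t in nodes)

# Hypothetical finite tree: r is the root, a and b its children, and so on.
parent = {"r": None, "a": "r", "b": "r", "c": "a", "d": "a", "e": "b"}
q = lambda t: depth(parent, t)

print(all(is_antichain(parent, F) for F in fibres(parent, q).values()))
```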


A similar proof (with a different proof that P is ccc) would show that 
under MA + ¬CH, every Aronszajn tree (T, ≤) is normal (under the 
topology induced by taking all sets of the form {x ∈ T | s < x ≤ t} where 
s < t in T as a basis). 

We close with a combinatorial theorem. A family 𝒜 of sets is called 
almost disjoint if the intersection of each pair of distinct elements of 𝒜 is 
finite. 


17. THEOREM (Wage; MA). Suppose that κ is regular and ω < κ ≤ λ < 2^ω. 
If 𝒜 is a family of λ almost disjoint countable subsets of κ, then there is a 
B ⊆ κ of cardinality κ such that 𝒜 ∪ {B} is almost disjoint. 


PROOF. Let 𝒜 = {A_α}_(α∈λ). Define P = {(H, K) | H is a finite subset of λ and 
K is a finite subset of κ}. Partially order P by (H, K) ≤ (H′, K′) in P 
provided H ⊆ H′, K ⊆ K′, and (K′ − K) ∩ (∪_(γ∈H) A_γ) = ∅. 

Let {(H_α, K_α)}_(α∈ω_1) be an uncountable subset of P. By a Δ-system 
argument (see Chapter B.3), there is an uncountable M ⊆ ω_1, n ∈ ω, H ⊆ λ, 
and K ⊆ κ such that α ≠ β in M implies that H_α ∩ H_β = H and K_α ∩ K_β = 
K, and α ∈ M implies that K_α − K has n terms. Observe that {K_α − K}_(α∈M) 
are disjoint and {∪_(γ∈H_α−H) A_γ}_(α∈M) are almost disjoint. Let C be any 
infinite countable subset of M. Since each ∪_(γ∈H_α) A_γ is countable and 
{K_β − K}_(β∈M) are disjoint, we can choose an L ⊆ M such that |L| > n and 
(∪_(α∈C) ∪_(γ∈H_α) A_γ) ∩ (∪_(β∈L) (K_β − K)) = ∅. Since |L| > n, |K_α − K| = n for all 
α ∈ C, and {∪_(γ∈H_β−H) A_γ}_(β∈L) are almost disjoint, there is an α ∈ C and 
β ∈ L such that (K_α − K) ∩ (∪_(γ∈H_β−H) A_γ) = ∅. Hence (H_α, K_α) ≤ 
((H_α ∪ H_β), (K_α ∪ K_β)) and (H_β, K_β) ≤ ((H_α ∪ H_β), (K_α ∪ K_β)). Thus P is 
ccc. 

For α ∈ λ, let X_α = {(H, K) ∈ P | α ∈ H} and, for β ∈ κ, let Y_β = 
{(H, K) ∈ P | (κ − β) ∩ K ≠ ∅}. Since both X_α and Y_β are dense in P for all 
α ∈ λ and β ∈ κ, there is a compatible Q which meets every X_α and Y_β. 
Let B = ∪{K ⊆ κ | (H, K) ∈ Q}. Since Q ∩ Y_β ≠ ∅ for all β ∈ κ, |B| = κ. 
If α ∈ λ, there is an (H, K) ∈ Q ∩ X_α. If (H′, K′) is any other member of 
Q, there is (H″, K″) ∈ P with (H, K) ≤ (H″, K″) and (H′, K′) ≤ (H″, K″). 
Since (K″ − K) ∩ A_α = ∅ and K′ ⊆ K″, K′ ∩ A_α ⊆ K. Thus B ∩ A_α ⊆ K so 
B ∩ A_α is finite. □ 


From this mixed bag of theorems can we generalize about when we
should expect MA to be useful? In topology MA + ¬CH can be used to
construct a variety of normal but not collectionwise normal spaces. MA can
be used to deny the existence of certain pathologies in ccc spaces. MA
often has something to say about problems involving compact Hausdorff
spaces. For instance many of the traditional theorems about βN (the
Stone-Čech compactification of the integers) proved using CH have a more
general version proved using only MA. Baire category and measure theory
theorems proved traditionally for countable sets can often be extended to
sets of cardinality less than 2^ω. And in any field of mathematics when one
would like to prove a theorem for ℵ_1 sets and one knows an inductive
proof of the theorem for countable sets, there is a natural setup for
applying (MA + ¬CH). The difficulty is in proving that this natural partial
order is ccc. It may not be ccc, in which case MA may not be applicable.
Introduction


It is not at all surprising that developments in set theory affect topology. 
This is true about the theory of the most familiar topological spaces, like 
that of the reals and of its subspaces, as well as about the general theory of 
topological spaces. The most striking examples in the first area are the 
recent advances in descriptive set theory (discussed in Chapter C.8) while 
the purpose of the present paper is to exhibit results of the second sort. 

The underlying theme of the paper is to introduce certain set-theoretic
assumptions (like Martin’s axiom, ◇, etc.) unearthed by set theorists (who
have proved their consistency) and then illustrate how these can be used in 
proving topological theorems or constructing counterexamples. This we do 
without worrying about how the actual consistency of these principles is 
established. There is really nothing strange, or even new, in this practice; 
think of the numerous results obtained with the use of the continuum 
hypothesis by people who had no idea how to prove its consistency (or did 
not even know it was consistent). 

There are several reasons for the promotion of this practice among
topologists. First of all, it turns out to be “nice mathematics”, so it serves
the aesthetic pleasure of those fond of set-theoretic methods. Secondly, by
accumulating “data”, it might contribute to the better understanding of the
“nature” of set theory itself.

The notation used in this paper follows Chapter B.3 and Juhász [1971];
one exception is the tightness, which is denoted by t(X) instead of a(X).
Also the notion of the π-character πχ(p, X) of a point p in a space X is not
defined there; it is the smallest cardinal of a family 𝒫 of non-empty open
sets in X such that every neighborhood of p contains a member of 𝒫. (We
recall that a π-base 𝒫 for a space is the global analogue of this, i.e., a
family of non-empty open sets such that every non-empty open set contains
a member of 𝒫, while the π-weight π(X) of a space X is the smallest
cardinal of a global π-base for X.)

We shall say that a space is κ-Baire if it cannot be written as the union of
fewer than κ nowhere dense subsets. 2^ω-Baire spaces are also called strongly
Baire.


1. Applications of Martin’s axiom 


1.1. π-completeness
As is shown, e.g. in Martin and Solovay [1970], MA is equivalent to the
statement that every compact T_2 space with the Suslin property satisfies the
strong Baire property, i.e. is not the union of fewer than 2^ω nowhere dense
subsets. It turns out that compactness can be replaced by a much more
general “completeness” property here, which is not surprising since we
know that the Baire property is more closely related to completeness than
to compactness.

The property we are about to introduce is quite general; it includes e.g.
all Čech-complete spaces as well as all almost subcompact spaces (cf.
Aarts and Lutzer [1974]). We recall that a filter base ℱ in a space X is
called regular if F_1, F_2 ∈ ℱ implies the existence of F ∈ ℱ such that
F̄ ⊂ Int(F_1 ∩ F_2).


DEFINITION. A regular space X is called π-complete if it has a family
{𝒫_α: α < λ} of π-bases with λ < 2^ω such that whenever ℱ ⊂ ∪{𝒫_α: α <
λ} is a regular filter base with |ℱ| < 2^ω and ℱ ∩ 𝒫_α ≠ ∅ for each α < λ,
then ∩ℱ ≠ ∅.


REMARK. The restriction to regular spaces is not essential; everything
below goes through without regularity if instead of π-bases we take
families of open sets with the property that every non-empty open set
contains the closure of a member of them.


1.2. THEOREM (MA). Every π-complete space X with the Suslin property
(i.e., with c(X) = ω) has the strong Baire property.


PROOF. Let {𝒫_α: α < λ} be the family of π-bases required for
π-completeness and put 𝒫 = ∪{𝒫_α: α < λ}. We define a partial order <
on 𝒫 by letting P < Q iff P̄ ⊂ Q and P ≠ Q. Clearly, (𝒫, <) is a CCC
partial order.

Now let κ < 2^ω and assume that {A_ξ: ξ < κ} are nowhere dense subsets
of X. We shall show that X ≠ ∪{A_ξ: ξ < κ}. For every α < λ and ξ < κ
let


    D_{α,ξ} = {P ∈ 𝒫_α: P ∩ A_ξ = ∅}.


It is easy to see that every D_{α,ξ} is dense in the partially ordered set (𝒫, <).
So by MA there exists 𝒢 ⊂ 𝒫 which is generic over the family {D_{α,ξ}: α < λ,
ξ < κ}. Genericity in (𝒫, <) obviously implies that 𝒢 is a regular filter base
in X which intersects every 𝒫_α for α < λ and has a member missing A_ξ for
every ξ < κ. Now an easy “Löwenheim-Skolem type” (see Chapter A.2)
argument yields us a regular filter base ℱ ⊂ 𝒢 with |ℱ| < 2^ω and having
the above properties of 𝒢. By the definition of π-completeness we have
then


    ∅ ≠ ∩ℱ ⊂ X \ ∪{A_ξ: ξ < κ},


which proves the theorem. Let us remark that if 2^ω = ω_1 (i.e. CH holds)
then the result remains valid without the assumption c(X) = ω. □


COROLLARY (MA). If X is π-complete and c(X) = ω, then X^ω has the strong
Baire property.


PROOF. If {𝒫_α: α < λ} is a suitable family of π-bases for X, let us define
𝒫_{α,n} for α < λ and n < ω as the set of all elementary open subsets of the
product X^ω of the form


    ∩{π_i^{-1}(P_i): i ∈ I},


where I ∈ [ω]^{<ω}, P_i ∈ 𝒫_α for each i ∈ I, and n ∈ I. Then each 𝒫_{α,n} is a
π-base of X^ω, moreover it follows easily from the definitions that if
ℱ ⊂ ∪{𝒫_{α,n}: α < λ, n < ω} is a regular filter base with |ℱ| < 2^ω and
ℱ ∩ 𝒫_{α,n} ≠ ∅ for each α and n, then ℱ_n = {π_n(P): P ∈ ℱ} is a regular filter
base contained in ∪{𝒫_α: α < λ} intersecting every 𝒫_α and having
|ℱ_n| < 2^ω, consequently ∩ℱ_n = ∩{π_n(P): P ∈ ℱ} ≠ ∅ for each n < ω,
hence ∩ℱ ≠ ∅. This shows that X^ω is π-complete. If CH fails we also
have c(X^ω) = ω by Theorem 6 of Chapter B.6 on MA and the fact that
c(X) = ω. Thus we can apply Theorem 1.2 and the remark at the end of its
proof to conclude that X^ω has the strong Baire property. □


The following result does not use MA and thus might be of interest in 
itself. 


1.3. THEOREM. Let X be any space and κ be an infinite cardinal. If X^ω is
κ-Baire and π(X) < κ, then d(X) = ω (i.e. X is separable).


PROOF. Let 𝒫 be a π-base for X with |𝒫| < κ. For any fixed P ∈ 𝒫 let 𝒢_P
be the family of all elementary open subsets of X^ω one of whose factors is
equal to P, i.e.,


    𝒢_P = {∩_{i∈I} π_i^{-1}(G_i): I ∈ [ω]^{<ω} & ∃ i ∈ I s.t. G_i = P}.


Clearly, 𝒢_P is a π-base for X^ω, hence D_P = ∪𝒢_P is a dense open subset of
X^ω. By the κ-Baire property of X^ω we have D = ∩{D_P: P ∈ 𝒫} ≠ ∅, so
let x ∈ D. We claim that S = {x_n = π_n(x): n ∈ ω} is dense in X. Indeed, let
G ⊂ X be an arbitrary non-empty open set in X. Then there is P ∈ 𝒫 with
P ⊂ G, and since x ∈ D_P there is n ∈ ω such that π_n(x) = x_n ∈ P ⊂ G,
proving that S is dense in X, hence d(X) = ω. □


COROLLARY (MA). If X is π-complete, c(X) = ω, and π(X) < 2^ω, then
d(X) = ω.


PROOF. The proof is immediate from Corollary 1.2 and Theorem 1.3.


1.4. REMARK. This last result is the basis for proving a number of theorems
of the following kind: (MA + ¬CH) implies that if X is complete, Suslin
and has “small” local characters, then X is separable (cf. Hajnal and
Juhász [1971], Šapirovskiĭ [1972], Tall [1974]). The next result is a very
general one of this sort. We omit its proof because it uses only “conventional”
techniques to deduce the result from the Corollary to Theorem 1.3.


THEOREM (MA). Suppose X is such that
(i) every closed subspace of X is π-complete;
(ii) c(X) = ω;
(iii) t(X)^+ < 2^ω and πχ(p, X) < 2^ω for every p ∈ X.
Then X is separable.


Čech-complete spaces are examples of spaces satisfying (i). Šapirovskiĭ
proved (unpublished) that for X compact, πχ(X) ≤ t(X). Therefore as a
consequence of the above theorem we get: (MA + ¬CH) implies that if X
is compact, Suslin and t(X) = ω, then X is separable. It is not known
whether t(X) = ω could be replaced by πχ(X) = ω in this result.

The previous results seem to need the “full force” of MA, while the ones
we shall look at below all depend on a combinatorial consequence of MA
that is known to be strictly weaker than MA (cf. Kunen and Tall
[1977]).

So let us recall (Theorem 8 of Chapter B.6) that the following proposition
is a consequence of MA:


    (*) If 𝒜 ⊂ P(ω), |𝒜| < 2^ω and |∩ℬ| = ω whenever ℬ ⊂ 𝒜 is
        finite, then there is an infinite S ⊂ ω such that S \ A is finite
        for all A ∈ 𝒜 (i.e. S is almost contained in every member of
        𝒜).
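
When the family in (*) is countable, no extra axiom is needed. The following diagonal argument is a sketch of that standard case, included only to show what (*) asserts; the enumeration A_n and the points s_n are names introduced here. MA is what extends the conclusion to families of size less than 2^ω.

```latex
% Sketch for a countable family \mathcal{A} = \{A_n : n < \omega\}; the s_n are ours.
Assume $|\bigcap \mathcal{B}| = \omega$ for every finite
$\mathcal{B} \subset \mathcal{A}$. Choose recursively
\[
  s_n \in A_0 \cap \dots \cap A_n , \qquad s_n > s_{n-1} \quad (n \ge 1),
\]
which is possible because each $A_0 \cap \dots \cap A_n$ is infinite.
Put $S = \{ s_n : n < \omega \}$. Then $S$ is infinite and, for every $n$,
\[
  S \setminus A_n \subset \{ s_0, \dots, s_{n-1} \},
\]
so $S \setminus A_n$ is finite.
```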


The following result is an immediate consequence of this proposition. 


508 JUHASZ/CONSISTENCY RESULTS IN TOPOLOGY [cH. B.7, §1 


1.5. THEOREM (MA; cf. Hechler [1976]). Let X be a separable and
countably compact space and 𝒰 be an open cover of X with |𝒰| < 2^ω. Then
there is a finite subfamily 𝒱 ⊂ 𝒰 such that ∪𝒱 is dense in X.


PROOF. Let S ⊂ X be a countable dense subset of X and assume that 𝒰
has no finite subset 𝒱 with dense union. Then the family


    𝒜 = {S \ Ū: U ∈ 𝒰}


of subsets of S clearly has the property that any intersection of finitely
many members of 𝒜 is infinite. Hence using (*) we can find an infinite
subset C ⊂ S with C \ (S \ Ū) = C ∩ Ū finite for each U ∈ 𝒰. But then C
can have no limit point in X, since 𝒰 is a cover, which contradicts the
countable compactness of X. □


The next result of W. Weiss follows from Theorem 1.5, and is especially 
interesting because of a later result that “‘confronts’’ it. 


1.6. THEOREM (MA + ¬CH). Every countably compact, perfect, and regular
space X is compact.


PROOF. It is sufficient, of course, to show that X is Lindelöf. So assume,
striving for a contradiction, that X is not Lindelöf. It is shown e.g. in
Stephenson [1972] that a perfect and countably compact space has
countable spread, i.e. s(X) = ω. Since X is not Lindelöf, it contains a
right-separated subset R = {p_ξ: ξ < ω_1} of type ω_1, moreover s(R) ≤
s(X) = ω implies that R is actually hereditarily separable (cf. Juhász
[1971]). For each ξ < ω_1 there is a neighborhood U_ξ of p_ξ in X with p_η ∉ Ū_ξ
for η > ξ, where the regularity of X is also taken into account. Since X is
perfect, the open set G = ∪{U_ξ: ξ < ω_1} is an F_σ, hence there are closed
sets {F_n: n < ω} such that G = ∪{F_n: n < ω}. There is an n_0 < ω then
with |R ∩ F_{n_0}| = ω_1.

Let us put Z = cl(R ∩ F_{n_0}), then Z is countably compact being closed in
X and separable because its dense subset R ∩ F_{n_0} is. Finally 𝒰 = {U_ξ:
ξ < ω_1} is an open cover of Z with |𝒰| < 2^ω, hence Hechler’s theorem can
be applied and yields us a finite


    𝒱 = {U_{ξ_1}, ..., U_{ξ_m}} ⊂ 𝒰 such that Z ⊂ Ū_{ξ_1} ∪ ... ∪ Ū_{ξ_m}.


But if ξ is the largest among the ξ_k (k = 1, ..., m) then this union contains
no p_η with η > ξ, contradicting that the uncountable set R ∩ F_{n_0} should be
contained in it. □

Finally, we look at some completely different consequences of this same
combinatorial principle. The following result is due to Malyhin and
Šapirovskiĭ [1973].


1.7. THEOREM (MA). Let X be any space and p ∈ X and S ⊂ X be such that
p ∈ S̄ \ S and |S| = ω. Then there is a sequence of points from S converging
to p provided that either of the following two conditions is satisfied:

(i) χ(p, S̄) < 2^ω;

(ii) X is regular, countably compact, and ψ(p, S̄) < 2^ω.


PROOF. We can assume, without loss of generality, X = S̄. Now let 𝒰 with
|𝒰| < 2^ω be a system of neighborhoods of p in X such that 𝒰 is a
neighborhood basis for p if (i) holds, or


    ∩{Ū: U ∈ 𝒰} = {p}

if (ii) holds. Next we define a family 𝒜 of subsets of S by putting


    𝒜 = {U ∩ S: U ∈ 𝒰}

if case (i) holds, and

    𝒜 = {Ū ∩ S: U ∈ 𝒰}


if case (ii) holds. Clearly, any finite intersection of members of 𝒜 is infinite
in both cases, since p cannot be isolated. Therefore proposition (*) can
again be applied to obtain an infinite subset M ⊂ S, M = {q_n: n < ω} such
that M \ U is always finite in case (i), and M \ Ū is always finite in case (ii).
This of course immediately implies that the sequence q_n converges to p in
case (i), or that it has no limit point in X \ {p} if case (ii) holds. But X being
countably compact in this case, p is the unique limit point of q_n, hence it
must converge to p, as follows immediately from the countable compactness
of X again.


COROLLARY 1 (MA). Any compact space of cardinality less than 2^{2^ω} is
sequentially compact.


PROOF. Let S ⊂ X be any countably infinite set. If S = S̄, we can obviously
select a convergent sequence from S. So we have S̄ \ S ≠ ∅. Using the
Čech-Pospíšil theorem (4.3 of Chapter B.3) there is a p ∈ S̄ \ S with
χ(p, S̄) < 2^ω, since otherwise we would have |S̄| = 2^{2^ω} > |X|, a contradiction.
Thus by case (i) of Theorem 1.7 a convergent sequence can again be
selected from S. □

COROLLARY 2 (MA). If X is regular, extremally disconnected, and separable,
then every non-isolated point of it has character = 2^ω, and if X is also
countably compact we have ψ(p, X) = 2^ω for p non-isolated in X.


Indeed, this follows immediately from Theorem 1.7 and the fact that in a
regular extremally disconnected space there is no non-trivial convergent
sequence. As an example of this, we have χ(p, βN) = 2^ω for every
p ∈ βN \ N (under MA).


2. Combinatorial principles valid in L 


While MA has, by now, become an accessible tool for the classical 
mathematicians (especially popular among topologists), the combinatorial 
principles that we are going to illustrate in this section are considerably less 
well known among non-logicians. It is true that they are more recent and 
look more complicated than MA, but we hope their consequences are 
interesting enough to convince the reader about their usefulness. It is an 
interesting phenomenon (also mentioned in Chapter B.6) that these 
principles often work in the opposite direction from MA + ¬CH. They
also have an advantage which MA does not have: they easily generalize to 
most higher cardinals. As to their origin and consistency, they are all 
“byproducts” of the (profound) investigations into the “‘fine structure” of 
L, the constructible universe, carried out mostly by R. Jensen (cf. Chapter 
B.5 on L), however their consistency is usually established more easily 
using “‘simple’’ forcing (cf. Chapter B.4 on forcing). 


2.1. The principle ◇

We use ◇ to abbreviate the following statement:

We can simultaneously associate to every α < ω_1 an S_α ⊂ α such that
whenever S ⊂ ω_1, the set {α < ω_1: S ∩ α = S_α} is stationary in ω_1. We refer
to the discussion of stationary sets, in Section 2 of Chapter B.3, where it is
also shown that ◇ → CH and ◇ → ¬SH.
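
The implication from ◇ to CH can be checked in a few lines. The sketch below reads the principle as quantifying over all S ⊂ ω_1 and uses only the fact that a stationary subset of ω_1 is unbounded; it is meant as an illustration, not as the argument of Chapter B.3.

```latex
% Sketch; assumes a diamond sequence (S_\alpha : \alpha < \omega_1) as in 2.1.
Let $S \subset \omega$. The set
\[
  \{ \alpha < \omega_1 : S \cap \alpha = S_\alpha \}
\]
is stationary, hence unbounded, so it contains some $\alpha \ge \omega$.
For such $\alpha$ we get $S = S \cap \alpha = S_\alpha$. Therefore
\[
  \mathcal{P}(\omega) \subset \{ S_\alpha : \alpha < \omega_1 \},
  \qquad\text{so}\qquad 2^{\omega} \le \omega_1 ,
\]
which is CH.
```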

In what follows we construct, using ◇, a small Dowker space with many
nice properties. The main idea is to ally the construction in Ostaszewski
[1975] of a hereditarily separable, locally compact, countably compact and
perfectly normal non-compact space with the basic trick of Rudin [1955].
In this latter paper it is shown that the existence of a Suslin tree (i.e. ¬SH)
implies that of a small Dowker space. Since ◇ → ¬SH, our result is not
too surprising, even if we take into account the nice additional properties
that our Dowker space possesses. The reason we are giving it here is to
publicize the method of construction rather than the result itself.

Let us note now that the sequence (S_α: α < ω_1) in the definition of ◇ has
the following property:


    (+) If A ⊂ ω_1 is uncountable, then there is a limit ordinal λ < ω_1
        such that S_λ ⊂ A and ∪S_λ = λ (i.e. S_λ is cofinal in λ).


Indeed, since A is unbounded, A', the set of its limit points, is closed and
unbounded. Therefore A' ∩ {α: A ∩ α = S_α} ≠ ∅, and clearly any λ from
this intersection will satisfy (+). Put L_1 to denote the set of all limit
ordinals in ω_1, moreover put for each λ ∈ L_1


    T_λ = S_λ   if ∪S_λ = λ;
    T_λ = λ     if ∪S_λ < λ.


Then every T_λ is cofinal in λ and clearly (T_λ: λ ∈ L_1) also has property
(+).


2.2. THEOREM (◇). There exists a topology τ on ω_1 × ω with the following
properties:
(i) τ is locally compact and Hausdorff;
(ii) τ is locally countable;
(iii) τ is hereditarily separable;
(iv) τ is normal;
(v) τ is not countably paracompact.


PROOF. Let us introduce some notation first. For λ ∈ L_1 put X_λ = λ × ω
and X = ∪{X_λ: λ ∈ L_1} = ω_1 × ω. Since ◇ → CH, we can arrange all
countable subsets of X in a sequence (A_λ: λ ∈ L_1), where we may also
assume that each A_λ is a bounded subset of X_λ, i.e. that A_λ ⊂ α × ω for
some α < λ. Finally, for every (α, n) ∈ X define


    B(α, n) = [α × (n + 1)] ∪ {(α, n)},
    C(α, n) = (ω_1 \ α) × (ω \ n).


To define τ, we shall first define by induction on λ ∈ L_1 topologies τ_λ and
sets Z_λ for every λ ∈ L_1 as follows.

Suppose λ ∈ L_1 and we have already defined a topology τ_σ on X_σ and a
set Z_σ ⊂ X_σ for each σ ∈ λ ∩ L_1 such that the following inductive
hypotheses are satisfied:

(1) τ_σ is locally compact and T_2;

(2) if ρ, σ ∈ λ ∩ L_1 and ρ < σ then (X_ρ, τ_ρ) is an open subspace of
(X_σ, τ_σ);

(3) for every (α, n) ∈ X_σ the set B(α, n) is τ_σ-open;

(4) for all ρ, σ ∈ λ ∩ L_1 with ρ ≤ σ, Z_ρ is a bounded τ_σ-clopen subset
of X_ρ.

Now if λ is a limit of limits (λ ∈ L_1'), then we are forced by (2) to take
∪{τ_σ: σ ∈ λ ∩ L_1} as a basis for τ_λ on X_λ, and it is quite obvious that
(1)-(3) will remain valid for τ_λ as well. Moreover since any Z_ρ is τ_σ-clopen
by (4) if ρ, σ ∈ λ ∩ L_1, ρ ≤ σ, we get that Z_ρ also remains τ_λ-clopen.
(Indeed, this follows immediately from X_λ \ Z_ρ = ∪{X_σ \ Z_ρ: σ ∈ λ ∩ L_1}.)

If on the other hand λ ∉ L_1', i.e. λ = σ + ω for some σ ∈ L_1, then we
have to do some work to define basic neighborhoods for the new points in
X_λ \ X_σ. To do that, first arrange the members of the countable family
{Z_ρ: ρ ∈ λ ∩ L_1} in an ω-type sequence (Z^{(n)}: n ∈ ω), and then consider
the set T_σ which by our choice is cofinal in σ. Pick a strictly increasing
cofinal sequence G^{(σ)} = {γ_t: t ∈ ω} ⊂ T_σ. For every (α, n) ∈ X_σ we have
then

    |B(α, n) ∩ (G^{(σ)} × ω)| < ω,


hence by (3), the set G^{(σ)} × ω has no limit point in (X_σ, τ_σ). But then using
the fact that (X_σ, τ_σ) is a countable locally compact T_2 space (and therefore
0-dimensional metrizable) we can select compact open τ_σ-neighborhoods
K_{t,s} of the points (γ_t, s) in G^{(σ)} × ω which are pairwise disjoint, satisfy
K_{t,s} ⊂ B(γ_t, s) for every t, s ∈ ω, and also have the property that for any
t, s ∈ ω if (γ_t, s) ∉ Z^{(l)} then K_{t,s} ∩ Z^{(l)} = ∅ for all l ≤ t. This last condition
can be met since each Z^{(l)} is τ_σ-clopen.
Now let us partition ω into disjoint infinite sets as follows:


    ω = ∪{a_{n,m}: n, m ∈ ω},


a_{n,m} ∩ a_{n',m'} = ∅ if (n, m) ≠ (n', m'), and |a_{n,m}| = ω for every n, m ∈ ω.
Every point of X_λ \ X_σ is of the form (σ + n, m). We define the k-th
neighborhood V_k(σ + n, m) of this point by


    V_k(σ + n, m) = ∪{K_{t,s}: t ∈ a_{n,m} \ k and s ≤ m} ∪ {(σ + n, m)}.


It is immediate from this definition that these countable neighborhoods
together with τ_σ generate a Hausdorff topology τ_λ on X_λ of which (X_σ, τ_σ)
is an open subspace. It is also easy to see that every V_k(σ + m, n) is
compact in (X_λ, τ_λ) because any infinite subset of it is either covered by
finitely many of the (compact) sets K_{t,s} forming V_k(σ + m, n), or intersects
infinitely many of them, and so in either case it has a limit point. We also
have V_k(σ + n, m) ⊂ B(σ + n, m) by the construction. Finally, given any
ρ ≤ σ, we have Z_ρ = Z^{(l)} for some l ∈ ω. But Z^{(l)} is bounded in X_σ and at
the same time G^{(σ)} converges to σ, hence for each s ∈ ω we have
(γ_t, s) ∉ Z^{(l)} for large enough t ∈ ω. But by our construction then we also
have V_k(σ + m, n) ∩ Z^{(l)} = ∅ for large enough k, for any fixed m, n ∈ ω,
hence no point of X_λ \ X_σ is a limit point of Z_ρ = Z^{(l)}, which thus remains
τ_λ-clopen.

Now it only remains to define Z_λ for both types of λ. To this end
consider the set A_λ. If A_λ is not τ_λ-closed, simply put Z_λ = ∅. If it is
τ_λ-closed pick first an α ∈ λ with A_λ ⊂ α × ω. Since (3) is valid, clearly
α × ω is τ_λ-open, hence using again the obvious fact that (X_λ, τ_λ) is a
countable metrizable space we can pick a τ_λ-clopen set Z_λ such that
A_λ ⊂ Z_λ ⊂ α × ω. This completes the induction.

Let τ be the topology on X generated by ∪{τ_λ: λ ∈ L_1}. We claim that
(i)-(v) are satisfied by (X, τ). (i) and (ii) follow immediately from the
hypotheses (1) and (2). To see the rest we first establish property (vi) which
is interesting in its own right.

(vi) For every σ ∈ L_1 and n ∈ ω we have cl_τ(T_σ × {n}) ⊃ C(σ, n).

To prove this, note first of all that


    cl_τ(T_σ × {n}) ⊃ [(σ + ω) \ σ] × (ω \ n)


is immediate from our construction, since every point (σ + m, n') is in the
τ-closure of G^{(σ)} × {n} ⊂ T_σ × {n} for n' ≥ n. Next we prove by induction
on the members λ of L_1 \ σ that (λ + m, n') ∈ cl_τ(T_σ × {n}) if n' ≥ n. We
have just established this for λ = σ, now assume λ ∈ L_1 \ (σ + ω) and that
the inductive hypothesis is valid for λ' ∈ L_1 with σ ≤ λ' < λ. But the same
argument as for the initial case of σ yields us


    (λ + m, n') ∈ cl_τ(G^{(λ)} × {n})

whenever n' ≥ n, moreover G^{(λ)} ∩ σ is finite by its definition, hence using
the inductive hypothesis for the members of (G^{(λ)} \ σ) × {n} we get

    (λ + m, n') ∈ cl_τ((G^{(λ)} \ σ) × {n}) ⊂ cl_τ(T_σ × {n}),


as was required.

Now we first show that (iii) is valid. Indeed, let Y ⊂ X be uncountable
and n be minimal such that Y ∩ [ω_1 × {n}] is uncountable. Then by (+)
there is σ ∈ L_1 such that T_σ × {n} ⊂ Y. By the choice of n we have that
Y ∩ [ω_1 × n] is countable, hence using (vi) we get that


    (Y ∩ [ω_1 × n]) ∪ (T_σ × {n}) ∪ (Y ∩ X_σ)


is a countable τ-dense subset of Y.

To see that (iv) is satisfied let us note first of all that the above argument
also yields us that any uncountable τ-closed subset of X contains a set of
the form C(σ, n). But clearly C(σ, n) ∩ C(σ', n') ≠ ∅ for any σ, σ' ∈ ω_1,
n, n' ∈ ω. Hence if H and K are two disjoint τ-closed sets in X then at least
one of them, say H, is countable. Then H = A_λ for some λ ∈ L_1, moreover
Z_λ is a τ-clopen set such that H = A_λ ⊂ Z_λ ⊂ α × ω for some α < λ.

Now H and K ∩ X_λ are disjoint τ- and therefore τ_λ-closed subsets of X_λ,
hence there are disjoint τ_λ- and therefore τ-open sets U and V in X_λ such
that H ⊂ U and K ∩ X_λ ⊂ V. But then U ∩ Z_λ and V ∪ (X \ Z_λ) are
disjoint τ-open neighborhoods of H and K respectively, showing that
(X, τ) is indeed normal.

To see (v), consider the sets F_n = C(0, n) = ω_1 × (ω \ n) for each n ∈ ω.
Clearly F_0 ⊃ F_1 ⊃ ..., ∩{F_n: n ∈ ω} = ∅, and each F_n is τ-closed, hence
it suffices to show that given any sequence (G_n: n ∈ ω) of τ-open sets such
that G_n ⊃ F_n for each n ∈ ω, then ∩{G_n: n ∈ ω} ≠ ∅. In fact we claim
that for any n ∈ ω and any τ-open set G_n ⊃ F_n we have [ω_1 × {0}] \ G_n is
countable. Suppose, on the contrary, that this set is uncountable. From the
proof of (vi) we get then that the τ-closed set X \ G_n ⊃ C(σ, 0) for some
σ ∈ ω_1, which however contradicts G_n ⊃ F_n = C(0, n). □


REMARK. The reader familiar with Ostaszewski’s construction (cf. also
Rudin [1974]) can easily convince himself that without much trouble we
could have also achieved that the subspace ω_1 × n of (X, τ) be countably
compact for each n ∈ ω, hence X be σ-countably compact. Another
refinement of the method allows one to construct a nice small Dowker
space with the aid of CH only (cf. Juhász, Kunen and Rudin [1977]), but
then local compactness and σ-countable compactness have to be abandoned.
Ideas of this construction will also appear in 2.9.


2.3. Proposition W 

The combinatorial principle under consideration in this section is really 
an offspring of Jensen’s “morass” (cf. Devlin [1973]). Its relevance to
topology is well illustrated by the fact that it was originally designed by J. 
Silver to replace the ‘‘morass’’ in some constructions used to get large 
S-spaces in the constructible universe. In case the reader finds this 
principle W too complicated, he is advised to look up the definition of a 
morass. 

We shall actually generalize somewhat Silver’s W by introducing a
cardinal parameter into it. (W = W(ω_2).) So let κ be an infinite cardinal,
we shall use W(κ) to denote the following statement:

There exists a tree T of height ω_1 + 1 with the following properties
(i)-(iii) (we use T_α to denote the α-th level of T and p_α to denote the
function assigning to every t ∈ T_β with β ≥ α the (unique) predecessor of t
at level α):

(i) |T_{ω_1}| = κ and |T_α| ≤ ω for α < ω_1;

(ii) if s, t ∈ T_{ω_1}, s ≠ t, then p_α(s) ≠ p_α(t) for some α < ω_1;

(iii) there exists a sequence {w_α: α < ω_1} with each w_α a countable
family of countably infinite subsets of T_α such that (*) for every A ∈ [T_{ω_1}]^ω
there is an α_A < ω_1 with p_α(A) ∈ w_α if α_A ≤ α < ω_1.

Now, it is not hard to show that both W(ω) and W(ω_1) are equivalent to
CH. It is also easy to see that W(κ) implies κ ≤ 2^{ω_1} because |∪{T_α: α <
ω_1}| = ω_1 and different members of T_{ω_1} have different sets of predecessors.
A standard forcing argument will show that, on the other hand (ZFC + 2^{ω_1}
is arbitrarily large + W(2^{ω_1})) is consistent (cf. Chapter B.4). The really
tough thing is to show that W(ω_2) = W(2^{ω_1}) holds in L; this was done by
Silver.
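
The bound on κ mentioned above can be spelled out as follows. This is a sketch of the counting argument only; the names U and Φ are introduced here and are not part of the definition of W(κ).

```latex
% Sketch; U and \Phi are names introduced for this illustration.
Put $U = \bigcup \{ T_\alpha : \alpha < \omega_1 \}$, so $|U| \le \omega_1$ by (i).
Define
\[
  \Phi : T_{\omega_1} \to \mathcal{P}(U), \qquad
  \Phi(s) = \{ p_\alpha(s) : \alpha < \omega_1 \}.
\]
By (ii), $s \neq t$ implies $p_\alpha(s) \neq p_\alpha(t)$ for some
$\alpha < \omega_1$, and since distinct levels are disjoint,
$p_\alpha(s) \in \Phi(s) \setminus \Phi(t)$. Thus $\Phi$ is one-to-one and
\[
  \kappa = |T_{\omega_1}| \le |\mathcal{P}(U)| \le 2^{\omega_1}.
\]
```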


2.4. DEFINITION. An infinite subset X ⊂ D(2)^{ω_1} (here D(2) is just the
discrete space on 2 = {0, 1}) is called an HFD-set (short for hereditarily
finally dense) if the following condition (**) is satisfied:


    (**) For every A ∈ [X]^ω there exists a ν_A < ω_1 such that
         whenever ε is any finite partial function from ω_1 \ ν_A into 2
         there is f ∈ A with ε ⊂ f.


Informally (**) says that the “tails” of the functions in A are dense in the
partial product D(2)^{ω_1 \ ν_A}.


2.5. THEOREM. If W(κ) holds, then there is an HFD-set X ⊂ D(2)^{ω_1} with
|X| = κ.
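
Condition (**) can also be phrased through the standard basic open sets of the product. The restatement below is only a notational sketch; the symbols [ε] and Fn(I, 2) are introduced here and do not appear in the original text.

```latex
% Restatement; [\varepsilon] and \mathrm{Fn}(I,2) are notation introduced here.
For a finite partial function $\varepsilon$ from $\omega_1$ into $2$ write
\[
  [\varepsilon] = \{ f \in D(2)^{\omega_1} : \varepsilon \subset f \}.
\]
The sets $[\varepsilon]$ form a base of $D(2)^{\omega_1}$, and (**) reads
\[
  \forall A \in [X]^{\omega} \;\; \exists \nu_A < \omega_1 \;\;
  \forall \varepsilon \in \mathrm{Fn}(\omega_1 \setminus \nu_A, 2) :
  \quad A \cap [\varepsilon] \neq \emptyset ,
\]
where $\mathrm{Fn}(I, 2)$ denotes the set of finite partial functions
from $I$ into $2$.
```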


PROOF. Let T and {w_α: α < ω_1} be as required in the definition of W(κ).
Let us introduce the following technical definition: for any S ∈ w_α put


    β(S) = min{β: ∀γ [(β ≤ γ ≤ α) → p_γ(S) ∈ w_γ and p_γ|S is 1-1]}.


Clearly, β(S) ≤ α. Now, our aim is to define functions f^t: α + 1 → 2 for
every t ∈ T_α (α < ω_1) by induction on α so that they approximate the
members of our purported set X. So assume that α < ω_1 and we have
already defined the functions f^t: β + 1 → 2, for t ∈ T_β and β < α, satisfying
the following inductive hypotheses:


    (1) if γ < β < α and t ∈ T_β, then f^t ⊃ f^{p_γ(t)};


    (2) if S ∈ w_β and ε is a finite function from a subset of
        (β + 1) \ β(S) into 2, then the set S_ε = {t ∈ S: ε ⊂ f^t} is
        infinite.

Next we have to define f^t if t ∈ T_α. The only problem, of course, is to
define f^t(α), since for any β < α we have to put


    f^t(β) = f^{p_β(t)}(β)


if we want to keep (1) valid for α. So let Z(α) be the family of all infinite
sets of the form S_ε, where S ∈ w_α and ε is a finite function from α \ β(S)
into 2. This makes sense because f^t(β) has already been defined for all
β < α. Let us note that we have S_ε = S for the empty function, hence
w_α ⊂ Z(α). Now the family Z(α) is a countable family of countably infinite
subsets of T_α, hence by Bernstein’s well-known theorem we can split T_α
into two disjoint sets H_0^{(α)} and H_1^{(α)} so that


    |S_ε ∩ H_i^{(α)}| = ω

for every S_ε ∈ Z(α) and i < 2. Then we define f^t(α) by putting

    f^t(α) = 0   if t ∈ H_0^{(α)};
    f^t(α) = 1   if t ∈ H_1^{(α)}.

It is clear from this definition that (2) is also satisfied for α, hence the
induction goes through. So we can define for each s ∈ T_{ω_1} a function
f_s ∈ D(2)^{ω_1} as follows


    f_s = ∪{f^{p_α(s)}: α < ω_1}.


Then it only remains to check that X = {f_s: s ∈ T_{ω_1}} ⊂ D(2)^{ω_1} is indeed
HFD. To see that let A ∈ [T_{ω_1}]^ω be arbitrary and choose α < ω_1 so that
p_α|A is 1-1 and p_γ(A) ∈ w_γ for all γ ≥ α, which is possible by (*). Now it
is clear from (1) and (2) that β(p_α(A)) will work as ν_A, as required in
(**). □


2.6. REMARKS. It is shown in Hajnal and Juhász [1974] that an HFD
subspace X of D(2)^{ω_1} is always hereditarily separable and hereditarily
normal. Now it is also immediate that if for every f ∈ X we have f' ∈ D(2)^{ω_1}
which differs from f only in a countable number of coordinates, then
X' = {f': f ∈ X} is also HFD. This enables us to get e.g. S-spaces of
cardinality κ if W(κ), thus showing that it is consistent to have an S-space
of cardinality 2^{ω_1} = 2^{2^ω}. The construction we gave above can also be
modified to obtain HFD subgroups of D(2)^{ω_1} (of size κ), thus yielding e.g. a
countably compact, hereditarily normal and hereditarily separable
non-Lindelöf topological group (cf. Hajnal and Juhász [1976a] for the case
W(ω_1) = CH).

There is also an analogous construction for getting L-spaces of weight κ
from W(κ), where an L-space is one which is regular, hereditarily Lindelöf
but not separable (a dual of S-spaces, cf. Hajnal and Juhász [1974]).

Let us note finally, that (MA + ¬CH) implies that there are no
HFD-sets at all. This can be seen e.g. by noting that according to Corollary
1 of Theorem 1.7 D(2)^{ω_1} is sequentially compact, while of course a
convergent sequence can never be dense “in a tail”.

Thus, W “works against” MA + ¬CH as we have seen with ◇.


2.7. The next combinatorial principle that we want to illustrate is called □_κ,
however we shall not need □_κ proper (the interested reader is referred to
Chapter B.5) because we only use the following consequence of it that we
denote by E(κ): There exists a subset E ⊂ κ such that

(i) α ∈ E → cf(α) = ω;

(ii) E is stationary in κ;

(iii) for each α < κ the set E ∩ α is not stationary in α.

E(ω_1) is a trivially true statement but E(ω_2) fails in some models of set
theory. However Jensen has shown that V = L implies E(κ) for “most”
regular cardinals κ (cf. Devlin [1973]).

An application of E(κ) is given in Hajnal and Juhász [1976b], where it
is shown that the E of E(κ) as a space of ordinals has the property that
every subspace of E of size less than κ is metrizable, while E is not.
Instead of repeating the arguments there we chose to give a more
complicated example, which of course has a number of nice additional
properties. In order to achieve that however we shall also need a
generalized version of ◇ which reads as follows:


2.8. DEFINITION. Let κ be an uncountable regular cardinal and E ⊂ κ be
any stationary set in κ. Then ◇_κ(E) denotes the statement that there is a
sequence {S_α: α ∈ E} such that S_α ⊂ α and for every X ⊂ κ the set


    {α ∈ E: X ∩ α = S_α}


is stationary in κ. Then ◇ of 2.1 is just ◇_{ω_1}(ω_1). It is again a result of Jensen
that if V = L, then ◇_κ(E) is always true whenever κ > ω is regular and E
is stationary in κ (cf. Devlin [1973]).

We will actually need the following consequence of ◇_κ(E) that follows
easily by means of a “coding” argument:


    (*) There is a sequence {(S_α, T_α): α ∈ E} such that S_α, T_α ⊂ α
        and for any X, Y ⊂ κ the set {α ∈ E: X ∩ α = S_α and
        Y ∩ α = T_α} is stationary in κ.
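
One possible form of the coding argument is sketched below. It is an assumption about how the coding may be carried out, not the author's own proof; the maps e and o, the set Z and the sequence (R_α) are names introduced here.

```latex
% Sketch of one possible coding; e, o, Z and R_\alpha are names introduced here.
Write each ordinal uniquely as $\xi = \lambda + n$ with $\lambda$ a limit
ordinal or $0$ and $n < \omega$, and put
\[
  e(\xi) = \lambda + 2n, \qquad o(\xi) = \lambda + 2n + 1 .
\]
Then $e$ and $o$ are one-to-one with disjoint ranges, and for a limit
ordinal $\alpha$ we have $\xi < \alpha \iff e(\xi) < \alpha \iff o(\xi) < \alpha$.
Let $(R_\alpha : \alpha \in E)$ witness $\Diamond_\kappa(E)$ and define
\[
  S_\alpha = \{ \xi : e(\xi) \in R_\alpha \}, \qquad
  T_\alpha = \{ \xi : o(\xi) \in R_\alpha \}.
\]
Given $X, Y \subset \kappa$ put $Z = e[X] \cup o[Y]$. If $\alpha \in E$ is a
limit ordinal with $Z \cap \alpha = R_\alpha$, then
\[
  X \cap \alpha = S_\alpha \quad\text{and}\quad Y \cap \alpha = T_\alpha .
\]
The set of such $\alpha$ is the intersection of a stationary set with the
closed unbounded set of limit ordinals, hence stationary in $\kappa$.
```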


Now we can turn to the formulation and proof of the above mentioned 
result. 


2.9. THEOREM. Suppose E witnesses E(κ) and ◇_κ(E) holds. Then there is a
topology τ on E which is
(i) locally countable;
(ii) locally compact, Hausdorff;
(iii) normal;
and for which
(iv) every subspace of E of size less than κ is metrizable, but (E, τ) is not.


PROOF. To construct the desired topology τ we define topologies τ_α on
E ∩ α for each α ∈ E by induction as follows. (Keep in mind that every
α ∈ E is an ω-limit!) Suppose α ∈ E and that for each β ∈ E ∩ α and
each γ ∈ E ∩ β we have already defined a topology τ_β on E ∩ β and a
decreasing sequence of sets {V_n(γ): n ∈ ω} (this does not depend on β) so
that the following conditions are satisfied:

(1) τ_β is a metrizable topology;

(2) {V_n(γ): n ∈ ω} is a neighborhood basis for γ in (E ∩ β, τ_β);

(3) each V_n(γ) is countable, compact and open in (E ∩ β, τ_β);

(4) the sequence γ_n = min V_n(γ) converges to γ in the usual topology of
ordinals.

Now, we have two cases to consider. First, if α has no immediate
predecessor in E, then we have no new V_n(γ) to define and τ_α has to be
the topology on E ∩ α generated by ∪{τ_β: β < α}. Let us note that
(E ∩ γ, τ_γ) is always an open subspace of (E ∩ β, τ_β) for γ < β < α, as
follows easily from (2), hence of (E ∩ α, τ_α) too. Therefore (2), (3) and (4)
are automatically valid for α as well, and thus the only thing left to prove is
that τ_α is metrizable. To see that, let us put α* = ∪(E ∩ α) and choose a
closed unbounded subset C ⊂ α* with C ∩ E = ∅, which is possible
because E ∩ α = E ∩ α* is non-stationary in α*. Then α*\C can be
written as a disjoint union of maximal open intervals:


α*\C = ∪{(α_i, β_i): i ∈ I},


where α_i, β_i ∈ C. We claim that for each i ∈ I the set (α_i, β_i) ∩ E is open in
(E ∩ α, τ_α). Indeed, if γ ∈ E ∩ (α_i, β_i) then by (4) there is n ∈ ω for which
V_n(γ) ⊂ (α_i, γ] ⊂ (α_i, β_i). But then (E ∩ α, τ_α) is the topological sum of its
subspaces E ∩ (α_i, β_i), which are metrizable by (1), hence so is τ_α.

Next assume that β is the immediate predecessor of α in E. Then we first
have to define the sets V_n(β) which will form a neighborhood basis of β in
(E ∩ α, τ_α). (Note that E ∩ α = (E ∩ β) ∪ {β}.) To this end we again have
to distinguish two cases.

Case a. ∪(E ∩ β) = β and S_β and T_β are disjoint unbounded subsets
of E ∩ β. Then we choose two sequences A = {σ_n: n ∈ ω} ⊂ S_β and
B = {ρ_n: n ∈ ω} ⊂ T_β, such that σ_n ↗ β and ρ_n ↗ β. Clearly then A and B
are closed discrete subsets of the (metrizable) space (E ∩ β, τ_β), hence we
can put pairwise disjoint neighborhoods about the points of A ∪ B, say
V_{k_n}(σ_n) and V_{l_n}(ρ_n), where we can also assume, by (4), that


σ_{n-1} < min V_{k_n}(σ_n) and ρ_{n-1} < min V_{l_n}(ρ_n)


for each n ∈ ω\{0}.
Now we can define the neighborhoods V_m(β) by putting


V_m(β) = ∪{V_{k_n}(σ_n) ∪ V_{l_n}(ρ_n): n ∈ ω\m} ∪ {β}.


Clearly V_m(β) is then countable and V_m(β) ∩ (E ∩ β) is open in τ_β, hence
we indeed get a topology τ_α on E ∩ α = (E ∩ β) ∪ {β} taking the V_m(β)
as an open neighborhood basis of β. That each V_m(β) is also compact in τ_α
is proved easily from the inductive hypotheses, just like in the proof of 2.2.
Moreover β_m = min V_m(β) ≥ min{σ_{m-1}, ρ_{m-1}} for each m > 0, hence
β_m → β, showing that (2)-(4) are satisfied for α. Finally the metrizability of
τ_α follows from the fact that it is first countable and regular (being locally
compact and Hausdorff) and throwing away a single point from E ∩ α,
namely β, we get a metrizable subspace, namely (E ∩ β, τ_β) (cf. HAJNAL
and JUHÁSZ [1976b]).

Case b is when case a does not hold. Then we simply add β as an
isolated point to get τ_α, and put V_n(β) = {β} for each n ∈ ω. It is trivial
then that conditions (1)-(4) will be satisfied.

Thus we can define τ_α for each α ∈ E and then put ∪{τ_α: α ∈ E} as a
basis for the topology τ on E. Clearly then (i) and (ii) hold, moreover every
subspace of (E, τ) of cardinality less than κ is a subspace of some
(E ∩ α, τ_α) and hence is metrizable.

To see that (E, τ) is normal, let H and K be any two disjoint closed sets
in it. If both are bounded, they are both contained in an open metrizable
subspace (E ∩ α, τ_α) hence can be separated, trivially. They cannot both be
unbounded though, because suppose they were, then the set


H′ ∩ K′ ∩ {β ∈ E: H ∩ β = S_β and K ∩ β = T_β}


would be non-empty by (*) (here of course H′ and K′ denote the sets of all
ordinary limit points of H and K, respectively), and by our construction
any such β is in the τ-closure of both H and K. So we can assume e.g. that
H is bounded and K is not. Now let ν ∈ κ be an ω₁-limit such that H ⊂ ν.
If α is the smallest member of E\ν, then ν < α, moreover E ∩ ν =
E ∩ α ⊃ H. Then H and K ∩ α are disjoint closed sets in (E ∩ α, τ_α),
hence can be separated by τ_α-open, hence τ-open, sets U and V. On the
other hand, for each β ∈ K\α we have ν < β, hence we can choose k_β
such that V_{k_β}(β) ⊂ E\ν, and then the open sets U and


V ∪ ∪{V_{k_β}(β): β ∈ K\α}


clearly separate H and K, hence we have (iii).

Finally, we claim that J = {α ∈ E: α is not isolated in (E, τ)} is a
stationary subset of κ. Indeed, by our construction, for any two disjoint and
unbounded subsets S and T of E we have


J ⊃ S′ ∩ T′ ∩ {α ∈ E: S ∩ α = S_α and T ∩ α = T_α},


so J contains a set which is stationary by (*). Knowing this about J we
show that (E, τ) is not even metacompact, hence is not metrizable. Indeed,
the family {E ∩ (α + 1): α ∈ E} is an open cover of (E, τ) and if 𝒢 is any
open refinement of it then for each α ∈ J there is n ∈ ω with V_n(α) ⊂ G_α
for some G_α ∈ 𝒢. Then f(α) = min V_n(α) < α, hence f is a regressive
function on J (cf. Theorem 2.3 of Chapter B.3) so by Neumer’s theorem
there is S ⊂ J, |S| = κ and β ∈ E such that f(α) = β for all α ∈ S. But
clearly each G_α is bounded in κ, hence


|{G_α: α ∈ S}| = κ,

and also

β ∈ ∩{G_α: α ∈ S} ≠ ∅


so 𝒢 is not point-finite. Thus (iv) is also satisfied and the proof is
completed. □


Finally we are going to look at some applications of Kurepa trees. Let us
recall (cf. Chapter B.4) that a tree of height ω₁ is called a Kurepa tree if it
has countable levels and more than ω₁ branches. Thus e.g. for any κ > ω₁
the tree T.T., of W(κ) is always a Kurepa tree. It is shown in Chapter
B.4 that the existence of Kurepa trees is both consistent with and
independent of the usual axioms of set theory.

The first application, due to K. Kunen, is concerned with the following
situation. Suppose we have CH, i.e. 2^ω = ω₁, but 2^{ω₁} > ω₂. Now it is well
known that a compact space of countable weight is either countable or has
cardinality 2^ω, independently of CH. Is it true then that in the above
situation the cardinality of a compact space of weight ≤ ω₁ is either ≤ ω₁
or equal to 2^{ω₁}? It turns out that the answer to this question can be both
“yes” and “no”, depending on your set theory. We shall illustrate one of
these answers by proving the following theorem.


2.10. THEOREM. Suppose 2^ω = ω₁, 2^{ω₁} > ω₂, and T is a Kurepa tree with
exactly ω₂ branches. Then there is a compact space of weight ω₁ and of
cardinality ω₂.


PROOF. We claim that the subspace X of D(2)^T consisting of the
characteristic functions of all connected chains of T is as required. (By a
connected chain we mean a chain C such that t ∈ C and s < t implies
s ∈ C as well.) Since not being a connected chain is clearly always decided
by two members of T, the complement of X in 2^T is open, hence X is
compact. |X| = ω₂ holds because there are only ω₁ bounded connected
chains in T by CH, and there are exactly ω₂ unbounded ones, by
assumption. Finally, w(X) = ω₁ is trivial from w(D(2)^T) = |T| = ω₁. □


Our last result answers a question raised by R. Sikorski in 1950, namely,
does there exist a Lindelöf ω₁-metrizable space of cardinality > ω₁ (cf.
SIKORSKI [1950], p. 132)? W. Weiss and the present author have recently
shown that the complete answer to this (really very natural) question is
given by a strange tree.


2.11. THEOREM. There exists a Lindelöf ω₁-metrizable space of cardinality
> ω₁ if and only if there is a Kurepa tree with no Aronszajn subtree.


We shall not give the proof of this result here; it will appear in JUHÁSZ
and WEISS [1977]. We only mention that V = L implies the existence of
such trees, as was shown by R. Jensen (cf. DEVLIN [1974]), while of course it
is consistent that there are no Kurepa trees at all (cf. Chapter B.4).
Strictly speaking, recursion theory is the study of the class of recursive,
or effectively computable, functions and their applications to mathematics. 
A broader interpretation is when recursion theory is taken to mean the 
study of the general process of definition by recursion, not just on natural 
numbers but on all types of mathematical structures. The first four chapters 
of this Part fit under the narrow definition, the last four under the broader 
one. 

The class of recursive functions is defined and studied in Enderton’s 
introductory chapter. This chapter discusses the arguments for the
identification of this class with the “effectively calculable” functions
(Church’s Thesis). It also introduces the reader to many of the ways that the basic
notions can be applied. 
The next three chapters take up in depth topics introduced in Enderton’s
chapter. Martin Davis’s chapter pursues the uses of the theory of recursive
functions for showing that certain classes of problems cannot be effectively
decided — the word problem for groups being one of the best known. The 
related problem of decidable versus undecidable theories of first-order 
logic is discussed in Rabin’s chapter. 

Among undecidable problems, some are more undecidable than others. 
The definition of “degree” of unsolvability is introduced in Section 8 of 
Enderton’s paper and a survey of important results on these degrees is 
given in Simpson’s chapter. 

Moving to the broader definition of recursion theory we come to Shore’s 
chapter on the generalization of recursion theory to admissible ordinals. 
Shore presents a fine introduction to the basic notions and, as a case study, 
shows what new considerations arise when the Splitting Theorem is 
generalized to admissible ordinals. The chapter also contains a very useful
annotated bibliography to the study of α-recursion theory.

The study of Kleene recursion in higher types (recursive functions of 
functions of functions, say, rather than recursive functions of natural 
numbers) has always been more or less inaccessible to all but the dedicated 
specialist — due to the difficulty of the basic papers in the subject. This 
situation should be remedied in the chapter by Kechris and Moschovakis, 
where a conceptually simple approach via inductive definability is taken. 

The study of inductive definitions in general is taken up in Aczel’s 
chapter. It should interest logicians of all persuasions since it combines the 
concerns of the recursion-theorist with the vantage points of the model- 
theorist and proof-theorist. 

Martin’s chapter discusses one of the major applications of recursion 
theory — to descriptive set theory. Here definability considerations over 
the continuum give rise to a beautiful theory which finds its origins in the 
French “constructivist” school of Borel, Baire and Lebesgue. 

We had planned to have a chapter on the more “practical” aspects of
recursion theory, those where running times of programs and computa- 
tional complexity appear, but this chapter did not materialize. Among 
other chapters of the Handbook relevant to recursion theory we mention 
Statman’s chapter on the equation calculus (in Part D), and Makkai’s 
chapter on admissible sets (in Part A). The recursion-theorist might also be 
interested to see some proof-theoretic applications of recursion theory in 
Feferman’s chapter in Part D. 
Introduction


This chapter presents an expository treatment of the elements of 
recursive function theory. It makes no claims of advancing to the frontiers 
of research in this field. It does attempt to indicate what background would 
be required of someone heading that way. 

The proofs in this chapter are often merely sketched, with indication of 
the main ideas involved. There are several books that give more thorough 
treatment to these topics. The primary reference in this field is ROGERS 
[1967]. A more condensed treatment can be found in Chapters 6 and 7 of 
SHOENFIELD [1967]. Turing machines are discussed, among other places, in 
books by YASUHARA [1971] and by Davis [1958]. There is a fairly recent 
book on degrees of unsolvability by SHOENFIELD [1971] and an older one by 
Sacks [1963]. Finally, the classic book by KLEENE [1952] contains much 
recursion theory. 


1. Informal computability 


The simplest conception of recursive functions is as “effectively
computable” functions. We will consider initially functions from natural numbers
to natural numbers, postponing the matter of functions on other sets. Let N
be the set {0, 1, 2, ...} of natural numbers, and let N^k be the cartesian
product N × N × ··· × N with k factors. Then the objects we want to
consider will be functions f with dom f ⊆ N^k for some positive k and
ran f ⊆ N. Such an object will be called a k-place partial function. The
word “partial” is a reminder that the domain is only a subset, possibly
proper, of N^k. (The partial function is said to be total if its domain is all
of N^k.)

It is clear from cardinality considerations that there are 2^{ℵ₀} k-place
partial functions for each positive k. From this huge inventory we want to
select the ℵ₀ functions that are recursive. We begin in this section with an
intuitive description of the notions we seek to capture. Then in the
next section we will turn to the methods for making the ideas precise.

Call a k-place partial function f effectively computable when there exists 
an effective procedure (i.e., an algorithm) that calculates f correctly. Now 
an effective procedure must meet the following criteria. 

(i) There must be exact instructions (i.e., a program), finite in length, for 
the procedure. These instructions cannot demand any cleverness or even 
understanding on the part of the person or machine following them. 
Executing the instructions must be a matter of merely following directions
carefully. 

(ii) If the procedure is given a k-tuple x in dom f, then after a finite
number of discrete steps the calculation must terminate and produce f(x). 

(iii) If the procedure is given a k-tuple x that does not belong to dom f, 
then the procedure might go on forever, never halting. Or it might get 
stuck at some point, but it must not pretend to produce a value for f at x. 

One can picture an industrious and diligent clerk, well supplied with 
scratch paper, tirelessly following his instructions. Alternatively, one can 
picture an automated version, a digital computer executing a program. 

Despite the fact that we have given only a suggestive description and not 
a mathematical definition, it is possible to develop nearly all of the theory 
of recursive functions on just this informal basis. (The recursive functions
are the effectively computable functions, but we reserve the term
“recursive” for the mathematically defined concept.) For evidence of this
possibility, we refer the reader to the book ROGERS [1967].

As examples of effectively computable functions we can cite addition 
and multiplication of natural numbers. Effective procedures for these 
functions (using decimal representation) are taught in the elementary 
schools. Any function with a finite domain is effectively computable. The 
instructions for computing such a function can contain a table listing all of 
its values. 

There are several sorts of restrictions of a practical nature that we do not 
impose on effective procedures. 

(i) Although each argument given the procedure as input must be a 
(finite) natural number, there is no bound imposed in advance on the size 
of the arguments. We do not rule out arguments that exceed the number of 
electrons in the universe, for example. 

(ii) Although the procedure must produce f(x), when x € dom f, after a 
finite number of steps, there is no bound imposed in advance on this 
number. 

(iii) Similarly, there is no bound imposed in advance on the amount of 
scratch paper (memory space) the procedure might require. Even multipli- 
cation of very large numbers can require large amounts of scratch paper. 

These considerations are relevant to the comparison of effective
computability to “practical computability”. A person with a digital computing
machine may regard a function f as being computable only when f(x) is
computable on his machine in a reasonable length of time. Of course, the
matter of what is reasonable may change from day to day. And next year
he hopes to get a faster machine with more memory space and tape drives.
At that time, his idea of what is computable in a practical sense will be
extended considerably.

The class of effectively computable functions is obtained in the ideal case 
where all of the practical restrictions on running time and memory space 
are removed. Thus the class is a theoretical upper bound on what can ever 
in any century be considered computable. 

It should be clear that if f and g are functions that agree at all but finitely 
many arguments, then f is effectively computable iff g is also effectively 
computable. Thus the question whether a function is effectively comput- 
able hinges solely on the behavior of that function in neighborhoods of 
infinity. 
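
The remark about finitely many exceptions can be made concrete with a small sketch. It is only an illustration, restricted to total functions for simplicity; the names `patch`, `table` and `square` are hypothetical and do not come from the text. Given a procedure for g and a finite table of exceptional values, a procedure for f consults the table first and otherwise defers to g.

```python
def patch(g, table):
    """Return a function that agrees with g except at the finitely many
    arguments listed in `table` (a dict from arguments to values)."""
    def f(x):
        if x in table:       # finite lookup: always terminates
            return table[x]
        return g(x)          # otherwise defer to the procedure for g
    return f


def square(x):
    # Stand-in for an arbitrary effectively computable g (an assumption).
    return x * x


f = patch(square, {0: 7, 3: 100})
print([f(x) for x in range(5)])  # [7, 1, 4, 100, 16]
```

The table is part of the finite list of instructions, which is why the construction stays within criterion (i) for effective procedures.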


2. Turing machines 


There are many equivalent ways of formulating the definition of 
recursiveness. A version phrased in terms of imaginary computing 
machines was given by the English mathematician Alan Turing in a
fundamental paper (TURING [1936]). (Related work was done
simultaneously but independently by Emil Post in New York; see Post [1936].)
Turing had the disadvantage of formulating this definition prior to the 
development of actual digital computers. In fact the flow of information 
was from the abstract to the concrete: von Neumann was familiar with 
Turing’s work, and Turing himself later played an enthusiastic role in the 
development of computers. 

On an informal level, we can begin by picturing a Turing machine as a 
black box together with a tape. The tape is marked off into squares, and 
each square can contain either the blank symbol 0 or the non-blank symbol 
1. The tape is potentially infinite in both directions, in that we never come 
to the end of it, but at any time only finitely many squares can be 
non-blank. Initially the tape contains the input numbers, and ultimately it 
contains the output number. At intermediate times it serves as memory 
space for the calculation. 

If we open up the black box, we find that it is a very simple device. It is 
capable of examining only one square of the tape at a time. The device 
contains a finite list of instructions (or states) q_0, q_1, ..., q_n. Each
instruction can indicate two possible courses of action, one to be followed if the
tape square under scrutiny contains a 0, the other to be followed if it 
contains a 1. In either event, a course of action can only consist of the 
following three steps: 
(i) A symbol (possibly the same as the old symbol) is written on the
tape square being scanned, thereby erasing the previous symbol. 

(ii) The tape is moved one square right or left. 

(iii) The next instruction is specified. 

Thus the list of instructions determines a transition function that, given 
the number of the present instruction and the symbol being scanned, 
produces the three-part course of action. We can formalize these ideas by 
taking the Turing machine simply to be this transition function. 


2.1. DEFINITION. A Turing machine is a function M such that for some
natural number n,


dom M ⊆ {0, 1, ..., n} × {0, 1},
ran M ⊆ {0, 1} × {L, R} × {0, 1, ..., n}.


For example we might have M(3, 1) = (0, L, 2). The intended meaning of
this is that whenever the machine comes to instruction q_3 while scanning a
square in which 1 is written, it is to erase the 1 (leaving a 0 in the square),
move the tape so as to examine the square just to the left of the present
square, and proceed next to instruction q_2. If M(3, 1) is undefined, then
whenever the machine comes to instruction q_3 while scanning a square in
which 1 is written, it halts. (This is the only way of stopping a calculation.)

This intended interpretation is not embodied in the formal definition of a 
Turing machine. But it does motivate and guide the formulation of all 
subsequent definitions. In particular, we can define what it means for a 
machine M to move (in one step) from one configuration to another. We 
do not need to present the formal definitions here, since they are only 
translations of our informal ideas. 

The input/output format consists of strings of 1’s, separated by 0’s. Let
⌜x⌝ be a string of 1’s of length x + 1. Thus


⌜x_1⌝ 0 ⌜x_2⌝ 0 ··· 0 ⌜x_k⌝


is the result of combining k strings of 1’s, each separated from the next
by a 0.

At last we can define recursiveness. A k-place partial function f is said to
be recursive if there exists a Turing machine M such that whenever we start
M at instruction q_0 scanning the leftmost symbol of


⌜x_1⌝ 0 ⌜x_2⌝ 0 ··· 0 ⌜x_k⌝


(with the rest of the tape blank), then:

(i) If f(x_1, ..., x_k) is defined, then M eventually halts scanning the
leftmost symbol of


⌜f(x_1, ..., x_k)⌝


and with the tape blank to the right of this string.

(ii) If f(x_1, ..., x_k) is undefined, then M never halts.

If R is a k-ary relation on the natural numbers, then R is said to be 
recursive if its characteristic function χ_R : N^k → {0, 1} is recursive. (Caution:
If a k-place partial function is recursive, then it does not follow that its 
graph is a recursive (k + 1)-ary relation.) 

For example the identity function f(x) = x is recursive, being computed 
by the empty machine. A less trivial case is addition x+y, which 
is computed by the machine whose values are listed in Table 1. The 
comments to the right are to help the reader, not the machine. (Turing 
defined M to be a set of quintuples instead of a function from pairs to 
triples. The table, being the graph of M, is essentially a set of quintuples.) 


Table 1


instruction  symbol   write  move  next   comment

     0         1        1     R     0     pass over x
     0         0        1     R     1     fill gap
     1         1        1     R     1     pass over y
     1         0        0     L     2     end of y
     2         1        0     L     3     erase a 1
     3         1        0     L     4     erase another 1
     4         1        1     L     4     back up
     4         0        0     R     5     halt

It is an exercise in programming to produce Turing machines for 
multiplication and exponentiation. 
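
The addition machine of Table 1 can be checked mechanically. The following is a minimal simulator sketch, not part of the original text: it represents M as a Python dictionary from (instruction, symbol) to (write, move, next), treats "L" and "R" as moving the scanned square left or right, and stops when M is undefined at the current pair. The helper names `encode_input`, `run`, `decode_output` and the step budget `max_steps` are illustrative assumptions.

```python
from collections import defaultdict

# Table 1 as a transition function:
# (instruction, scanned symbol) -> (symbol to write, direction, next instruction)
ADD = {
    (0, 1): (1, "R", 0),  # pass over x
    (0, 0): (1, "R", 1),  # fill gap
    (1, 1): (1, "R", 1),  # pass over y
    (1, 0): (0, "L", 2),  # end of y
    (2, 1): (0, "L", 3),  # erase a 1
    (3, 1): (0, "L", 4),  # erase another 1
    (4, 1): (1, "L", 4),  # back up
    (4, 0): (0, "R", 5),  # halt: no entries exist for instruction 5
}


def encode_input(*args):
    """Tape for x_1, ..., x_k: blocks of x_i + 1 ones separated by single 0's."""
    cells = []
    for i, x in enumerate(args):
        if i > 0:
            cells.append(0)
        cells.extend([1] * (x + 1))
    return cells


def run(machine, cells, max_steps=100_000):
    """Start at instruction 0 on the leftmost cell; return (tape, head) at the halt."""
    tape = defaultdict(int, enumerate(cells))  # unwritten squares read as blank (0)
    state, head = 0, 0
    for _ in range(max_steps):
        key = (state, tape[head])
        if key not in machine:  # M undefined here: the machine halts
            return tape, head
        symbol, move, state = machine[key]
        tape[head] = symbol
        head += 1 if move == "R" else -1
    raise RuntimeError("step budget exhausted (the machine may never halt)")


def decode_output(tape, head):
    """Length of the block of 1's starting at the head, minus 1."""
    n = 0
    while tape[head + n] == 1:
        n += 1
    return n - 1


def add(x, y):
    return decode_output(*run(ADD, encode_input(x, y)))


print(add(2, 3))  # 5
```

The step budget is a practical concession only: a genuine Turing machine has no such bound, and a simulator cannot in general decide whether a run that exceeds the budget would eventually halt.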

We should remark that many of the details of our definition of a Turing 
machine are somewhat arbitrary. If there were more than one tape, the 
class of computable functions would remain unchanged (although some 
functions could be computed more rapidly). Similarly we could allow more 
than the symbols 0 and 1. Or we could have the tape extend in only one 
direction from a starting point, instead of both directions. None of this 
affects the class of computable functions. What is essential in the definition 
is the provision for arbitrarily large amounts of “scratch-pad” storage space
and arbitrarily long calculations. 
We can give a specific example of a non-recursive function by using the
“busy beaver competition” of RADO [1962]. An n-state entry in this
competition is a Turing machine M with n + 1 instructions q_0, ..., q_n, the
last of which is used only for halting (both M(n, 0) and M(n, 1) are
undefined), and such that when started on a blank tape, M eventually halts.
When M does halt, its score in the competition is the number of 1’s on the
tape. Thus the machine tries to write as many 1’s on the tape as it possibly
can, but it must halt. Let Σ(n) be the maximum possible score for an
n-state entry.
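
As an illustration of the scoring rule, here is a hedged sketch that runs one particular 2-state entry on a blank tape and counts the 1's left at the halt. The machine `TWO_STATE` is a well-known small example written in the convention above (instructions q_0 and q_1 do the work, q_2 only halts); the function name `score` and the step budget are assumptions of this sketch.

```python
from collections import defaultdict


def score(machine, max_steps=10_000):
    """Run `machine` on a blank tape from instruction 0; return the number of 1's at the halt."""
    tape = defaultdict(int)  # every square starts blank (0)
    state, head = 0, 0
    for _ in range(max_steps):
        key = (state, tape[head])
        if key not in machine:  # undefined transition: the machine halts
            return sum(tape.values())
        symbol, move, state = machine[key]
        tape[head] = symbol
        head += 1 if move == "R" else -1
    raise RuntimeError("no halt within the step budget")


# A 2-state entry: M(2, 0) and M(2, 1) are undefined, so q_2 only halts.
TWO_STATE = {
    (0, 0): (1, "R", 1),
    (0, 1): (1, "L", 1),
    (1, 0): (1, "L", 0),
    (1, 1): (1, "R", 2),
}

print(score(TWO_STATE))  # 4
```

Running one entry only gives a lower bound on the maximum score; establishing the maximum requires accounting for every n-state machine, including those that never halt, which is where the difficulty lies.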


2.2. THEOREM (RADO [1962]). The function Σ is not recursive. In fact for any
total recursive f on N, we have f(x) < Σ(x) for all sufficiently large x.


PROOF. The function whose value at x is


max[f(2x + 2), f(2x + 3)]


is recursive, and hence is computed by some machine M having, say, k
instructions. For each x, consider a machine N_x that writes ⌜x⌝ on a blank
tape and then behaves like M. Then N_x is a (x + k + 2)-state entry in the
busy beaver competition. So its score (the number displayed above plus 1)
is bounded by Σ(x + k + 2), which for all x ≥ k is bounded by
Σ(2x + 2). □
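
The chain of inequalities behind this proof can be written out as follows. This is a reading of the argument, not a quotation: it assumes, as the proof does tacitly, that Σ is nondecreasing, and it takes the score of N_x to be the displayed maximum plus 1, as stated in the proof.

```latex
% Assumption: \Sigma is nondecreasing (used tacitly in the proof).
\begin{align*}
\max[f(2x+2),\, f(2x+3)] + 1
  &= \text{score of } N_x \\
  &\le \Sigma(x+k+2) \\
  &\le \Sigma(2x+2) \;\le\; \Sigma(2x+3) \qquad (x \ge k).
\end{align*}
```

Under that assumption, f(2x + 2) < Σ(2x + 2) and f(2x + 3) < Σ(2x + 3) whenever x ≥ k, so f(y) < Σ(y) for every y ≥ 2k + 2, covering even and odd arguments alike.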


We will construct other non-recursive functions later, but the Σ function
has a striking simplicity. The first few values of Σ are known: Σ(1) = 1,
Σ(2) = 4, and Σ(3) = 6 (LIN and RADO [1965]). Next Σ(4) = 13 (BRADY
[1966, 1975]). Beyond this point, only lower bounds are known: Σ(5) ≥ 17,
Σ(6) ≥ 35, Σ(7) ≥ 22961, and Σ(8) > 8 × 10^44 (GREEN [1964]).


3. Church’s thesis 


In Section 1 we discussed an informal concept of computability. In 
Section 2 we defined the mathematical concept of recursiveness. Do these 
two match? That is, is the concept of recursiveness the correct formaliza- 
tion of our intuitive concept of effective computability? The claim that it is 
indeed correct is known as Church’s thesis. This claim was advanced and 
defended by CHURCH [1935, 1936], and has been almost universally
accepted. 

There are two arguments supporting the view that the class of recursive 
functions is broad enough to contain all effectively computable functions. 
The first has to do with specific procedures: the procedures that have been
felt to be effective have, when examined, been found to be executable by 
Turing machines. The second argument has to do with the class of effective 
procedures as a whole: the several attempts that were made to formalize 
the concept of computability have all yielded concepts equivalent to 
recursiveness. In particular, the natural ways of liberalizing the definition 
of recursiveness (such as allowing several tapes) in the end yield notions 
equivalent to recursiveness. (The proofs of these results are, for the most 
part, not difficult once the techniques of the following section are known). 

Historically, the first appearance of a definition of recursiveness was in 
Kurt Gödel’s original paper (GÖDEL [1931]) on the incompleteness of
formal systems. He defined a relation to be entscheidungsdefinit if it was
binumerable in a certain formal system of number theory. (The concept of
binumeration may be found in Chapter D.1.) This is equivalent to our
definition of recursive relation. But Gödel was not at this time attempting
to formalize the concept of effective decidability or computability, and
attention was not focussed on this definition. In the same paper he defined
a class of functions called “recursive” (rekursiv); this is now called the class
of primitive recursive functions. The name “recursive” was appropriate,
since the central feature of the definition was a provision for finding 
f(n+ 1) from f(n). 

Gödel visited Princeton several times in the 1930’s, before moving there
permanently in 1940. In 1934, during one of these visits, he gave a talk for
which mimeographed notes were circulated. The notes were taken by
Kleene and Rosser, who about this time completed dissertations at
Princeton under Church. (The notes were eventually published as GÖDEL
[1965].) In this talk he raised the issue of effective computability. He noted
that more general forms of recursion would have to be admitted before his
previous recursive functions could include all computable functions. He
then defined a class he called “general recursive functions”, using ideas
that had been suggested to him in a letter from Herbrand. (This definition, 
which involved formal rules for deriving equations from others, is also 
equivalent to our definition of recursiveness.) 

Church had been at Princeton since 1929 and together with his student
Kleene had developed the concept of λ-definable functions. The question
of the relationship between λ-definability and effective computability was
studied by Church. CHURCH [1936] not only contained the proposal now
bearing his name, but also provided the first example of an unsolvable
decision problem. This is the problem whether a formula in the λ-calculus
has a normal form, which can be regarded as a decision problem in
elementary number theory. The proof of the equivalence of the concept of
λ-definable functions and the concept of general recursive functions was
due primarily to KLEENE [1936b].

TURING [1936] referred to Church’s paper, and presented yet another
definition of recursiveness (essentially the definition of Section 2). Turing
had independently had the idea of formalizing the concept of effective
computability, but was led to publish only when Church’s paper appeared.
In an appendix, Turing proved the equivalence of his definition to
λ-definability.

Post [1936] described Church’s thesis as being not a definition or an
axiom but a natural law, a “fundamental discovery” concerning “the
mathematicizing power of Homo Sapiens”, in need of “continual
verification”.

Until now we have dealt with functions as the basic objects of study; we 
have made scant reference to k-ary relations on N. Actually recursion 
theory can be developed in terms of either functions or relations, and with 
interchangeable results. We can, informally, call a relation R decidable if
there is an effective procedure that, given any x, replies “yes” if x ∈ R and
replies “no” if x ∉ R. (In discussing relations, we will write x ∈ R and
R(x) synonymously.) Then R is decidable iff the characteristic function of
R is effectively computable. Thus a consequence of Church’s thesis 
(equivalent in fact to the original form) is that the concept of a recursive 
relation is the correct formalization of the informal concept of a decidable 
relation. 


4. Universal machines and normal form


The initial application of recursive functions was to prove incomplete- 
ness theorems of logic. For that purpose, no deep results on the internal 
structure of the class of recursive functions are required. And in fact a 
more restricted class, such as the primitive recursive functions mentioned 
in the preceding section, would suffice. 

But as will be seen, there are other applications for recursive functions. 
And if for no other reason, the recursive functions would be studied for 
their own interest as the effectively computable functions. And the basic 
fact that gets such a study off the ground is the possibility of encoding 
machines into integers that can then be supplied as input to other (or the 
same) machines. 

There is a direct analogy here with actual digital computers. The earliest 
such computers were programmed by setting switches and inserting wires
into plugboards. It was then realized (by von Neumann) that for a suitably 
constructed computer, the program could be coded into machine words 
(i.e., integers) and stored in the machine in the same manner as data — the 
stored-program computer. The first practical benefit of this approach to 
programs is the speed at which new programs can be loaded into the 
computer to replace old programs. But a more significant benefit (for our 
purposes) is the possibility of executive programs, e.g. operating systems. 
An executive program accepts another program as incoming data. The 
executive program might then study the incoming object program and see 
that its instructions are carried out. 

The ideas behind stored-program computers can be carried over to 
Turing machines. (Historically it was the other way around.) A Turing 
machine M might be given two numbers as input, one of them a suitable 
encoding of Turing machine N, and the other a number x. Machine M 
might then serve as an executive program, and the output might be just the 
result of applying N to x. M can then be called a universal Turing machine. 

Carrying out these ideas and constructing a universal Turing machine 
turns out to be a straightforward (if somewhat lengthy) procedure. We will 
outline how it goes. First of all, each Turing machine is a finite object, and 
so can be encoded as a natural number under some fixed encoding. We can, 
for example, define the encoding 


$$\langle x_0, x_1, \ldots, x_n \rangle = 2^{x_0+1} \, 3^{x_1+1} \cdots p_n^{x_n+1}$$


in powers of primes as a way of condensing a finite string of numbers to a
single number. We then need a decoding function $(x)_i$ with the property
that for $i \le n$,


$$(\langle x_0, x_1, \ldots, x_n \rangle)_i = x_i.$$


Turing machines can be found to effect the above encoding, and inversely 
to do the decoding. 
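
To make the encoding concrete, here is a minimal Python sketch of the prime-power coding and its decoding function. The names `primes`, `encode` and `decode` are ours, chosen only for illustration; the text itself asserts no more than that Turing machines for these operations exist.

```python
def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for a small sketch)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1


def encode(xs):
    """Return <x_0, ..., x_n> = 2^(x_0+1) * 3^(x_1+1) * ... * p_n^(x_n+1)."""
    code = 1
    for p, x in zip(primes(), xs):
        code *= p ** (x + 1)
    return code


def decode(code, i):
    """Return (code)_i: one less than the exponent of the i-th prime in code."""
    gen = primes()
    for _ in range(i):
        next(gen)          # skip p_0, ..., p_(i-1)
    p = next(gen)
    exponent = 0
    while code % p == 0:
        code //= p
        exponent += 1
    return exponent - 1
```

For example, `encode([3, 0, 2])` is 2^4 · 3^1 · 5^3 = 6000, and `decode(6000, 2)` recovers 2. The empty sequence is coded by 1.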

At any point in the history of a Turing machine calculation, the entire 
configuration of the machine (the tape contents, the instruction number, 
and the square being scanned) can be described by a finite amount of 
information, and so can again be encoded into a number, called an 
instantaneous description. Then a computation record for machine M is a 
number encoding a finite sequence of instantaneous descriptions meeting 
the following conditions. 

(i) The first instantaneous description specifies instruction $q_0$ (the “initial
state”).

(ii) The last one has the machine at an instruction $q_i$ and scanning a
symbol s for which M(i, s) is undefined (a “halting configuration”).

(iii) Each one is related to the next in that M, when in the configuration 
given by one instantaneous description, moves in one step to the configura- 
tion given by the next. 

Thus a computation record is a natural number that encodes the entire 
history of one calculation by M, from its initial state $q_0$ (presumably with
some input of interest on the tape) until it halts. 

All this encoding would be pointless were it not for the fact: the results 
of the encoding are recursive functions and relations. That is, in going 
through the details of this encoding, one can verify at every step that 
Turing machines exist to handle the concepts involved. In the end one has 
the following two results. 

(i) There is a recursive ternary relation T that holds of e, $\langle x_1, \ldots, x_k \rangle$,
and y iff e encodes a Turing machine and y is a computation record for
that machine, starting with $\ulcorner x_1 \urcorner 0 \ulcorner x_2 \urcorner 0 \cdots 0 \ulcorner x_k \urcorner$ on the tape.

(ii) There is a recursive function U such that whenever
$T(e, \langle x_1, \ldots, x_k \rangle, y)$ holds, then U(y), the upshot of y, is the output
value of the calculation (provided the halting configuration is such that this
makes sense).

Even without the details, it should appear that T is intuitively decidable 
and U is computable. And so one would expect them to be recursive; that 
expectation is correct. Next we define, for each k, the k-place partial 
function 


$$\{e\}^k(x_1, \ldots, x_k) = U[\text{the least } y \text{ such that } T(e, \langle x_1, \ldots, x_k \rangle, y)].$$


Here we write “the least y” although it is quite possible that no such y
exists; if there is no such y then the function is undefined at that point. We
abbreviate all this by the letter $\mu$:


$$\{e\}^k(x_1, \ldots, x_k) = U(\mu y \, T(e, \langle x_1, \ldots, x_k \rangle, y)).$$


(The notation $\{e\}^k$ is Kleene’s; the notation $\varphi_e^k$ is used by Rogers. The
superscript k is omitted whenever possible.)
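
The $\mu$-operator is simply unbounded search, which can be sketched as follows. The function name `mu` and the optional `budget` argument are assumptions made for this illustration (the budget exists only so that an undefined value can be observed without looping forever); the relation T and the function U themselves are not implemented here.

```python
def mu(predicate, budget=None):
    """Return the least y with predicate(y) true.

    With budget=None this is the genuine mu-operator: if no such y
    exists the loop never ends, i.e. the value is undefined.
    With a finite budget the search gives up and returns None.
    """
    y = 0
    while budget is None or y < budget:
        if predicate(y):
            return y
        y += 1
    return None  # no witness found below the budget


def exact_root(x, budget=None):
    """A partial function defined by mu: the y with y*y == x, if any."""
    return mu(lambda y: y * y == x, budget)
```

Here `exact_root(49)` returns 7, while `exact_root(50)` without a budget would search forever, which is how a partial function arises from a total decidable predicate.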

We can now conclude the following fundamental theorem. The theorem 
in this form is due to KLEENE [1936a, 1943], but universal Turing machines 
appeared in the original paper by TURING [1936].


4.1. NORMAL FORM THEOREM.
(i) The (k + 1)-place partial function whose value at $(e, x_1, \ldots, x_k)$ is
$\{e\}^k(x_1, \ldots, x_k)$ is recursive.
(ii) For each e, the k-place partial function $\{e\}^k$ is recursive.
(iii) Every k-place recursive partial function equals $\{e\}^k$ for some e.


The number e will be called an index of $\{e\}^k$. Thus a partial function is
recursive iff it has an index. 

We want next to use this work to prove the unsolvability of the halting 
problem. Suppose we give input x to machine M and start it running. After 
the first million steps, we might become suspicious that it will never halt. 
On the other hand, maybe if we have just a little more patience, it will halt 
after a few more steps. Is there any way to test which of these two situations 
we are in? No, there is not. There is no effective procedure that, given M 
and x, will decide whether or not this calculation ever terminates. This is 
the content of the theorem below. For a partial function f, we write 
$f(x) < \infty$ to mean that f(x) is defined.


4.2. THEOREM (unsolvability of the halting problem). Neither $\{(x, y) : \{x\}(y) < \infty\}$
nor $\{x : \{x\}(x) < \infty\}$ is recursive.


PROOF. Our description of this proof (and others) relies on the reader’s
informal ideas of effective computability, but it can be translated into a
rigorous description involving Turing machines.

Let $K = \{x : \{x\}(x) < \infty\}$. It suffices to show that K, the diagonal of the
halting problem, is not recursive. We use a classical diagonal argument.
Consider the function


$$g(x) = \begin{cases} \{x\}(x) + 1 & \text{if } x \in K, \\ 0 & \text{if } x \notin K. \end{cases}$$


The function g is total, but it cannot be recursive (because $g(e) \ne \{e\}(e)$
for each e). But if K were recursive, then g would be. So K is not
recursive. □
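
The diagonal construction can be illustrated on a finite table standing in for the list {0}, {1}, {2}, .... In this sketch (a toy under stated assumptions, not a real machine enumeration) each entry is a Python function that returns `None` where the partial function is undefined. Being able to see that `None` is precisely the ability to decide K, which the theorem rules out for the real enumeration.

```python
def diagonalize(table):
    """Return g with g(x) = table[x](x) + 1 if defined, and 0 otherwise."""
    def g(x):
        value = table[x](x)        # None stands for "x is not in K"
        return 0 if value is None else value + 1
    return g


# Hypothetical table of four partial functions; None marks "undefined".
table = [
    lambda n: n,                        # total
    lambda n: None,                     # the empty function
    lambda n: 2 * n,                    # total
    lambda n: None if n == 3 else 0,    # undefined exactly at 3
]

g = diagonalize(table)
```

By construction g differs from the e-th entry at the argument e, for every e in the table, so g cannot occur in the table.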


In the foregoing sections, we have been totally indifferent to questions 
regarding just how long it took a Turing machine to compute a function 
value. But now suppose we examine $\Phi_e(x)$, the number of steps the
machine with index e uses in computing {e}(x). For any recursive function, 
we have the choice of infinitely many different machines to compute it. 
Some recursive functions are so stubborn that any available machine takes 
almost forever:


4.3. THEOREM (RABIN [1960]). For any total recursive H on N, we can find a
total recursive $F : N \to \{0, 1\}$ such that for any index e of F,


$$\Phi_e(x) > H(x)$$

for all sufficiently large x.


The proof involves gradually detecting indices of fast machines, and 
defining F so as to disagree somewhere with the result of such machines. 

A stronger result is the speed-up theorem, which indicates that a 
function need not have any fastest index, or even any almost fastest index 
for any reasonable meaning of ‘‘almost’’. 


4.4. SPEED-UP THEOREM (BLUM [1967]). For any total recursive function G
on $N \times N$, we can find a total recursive $F : N \to \{0, 1\}$ such that for each index i of
F there exists another index j of F such that


$$G(x, \Phi_j(x)) < \Phi_i(x)$$

for all sufficiently large x.

For example, take $G(x, y) = 2^y$. Then for any machine computing F,
there exists another machine exponentially faster for almost all inputs. The
theorems of Rabin and Blum are actually more general than we have
indicated. In these theorems $\Phi_e(x)$ can be any reasonable measure of the
complexity of computing {e}(x), subject only to some very modest
assumptions.

For any total recursive function L on N we can define the complexity
class $C_L$ of functions almost always computable in a number of steps
bounded by L:


$$C_L = \{F : \text{for some index } f \text{ of } F,\ \Phi_f(x) \le L(x) \text{ for all sufficiently large } x\}.$$


These complexity classes organize the recursive functions according to 
computational difficulty. In particular, we can say that F is no harder to 
compute than G if F belongs to every complexity class to which G 
belongs. This happens iff for every index of G there exists an index of F 
that is almost always just as fast. 


5. Oracles and functionals 


Three years after his original paper (TURING [1936]), TURING [1939]
introduced an extension of his concept of computability. Imagine a
computing agent (an industrious clerk or a machine) provided, as usual,
with explicit instructions and plenty of scratch paper. But in addition we 
now provide a new feature: an oracle for a particular function $\alpha$ from N
into N. (Here dom $\alpha$ is required to be all of N; let $N^N$ be the set of all such
functions.) An oracle for $\alpha$ is a device that, given a number x, responds by
producing the value $\alpha(x)$. For a recursive $\alpha$, we can make an oracle for $\alpha$
from a Turing machine. But more generally we can imagine an oracle for
an arbitrary function $\alpha$. Our computing agent supplied with this oracle now
can calculate not only the effectively computable partial functions, but can
further calculate (when given the right instructions) any partial function
that is “computable in $\alpha$”.

At first glance, the concept of computability in $\alpha$ seems quite odd. It
combines the most constructive approach to functions (that of computability)
with the least constructive approach (that of an oracle). But despite this
paradoxical appearance, the concept has proved to be valuable. And it led
eventually (in the 1950’s) to the concept of a recursive functional, i.e., a
recursive function accepting members of $N^N$ as arguments.

In general we will consider (k, m)-place partial functions; the domain of
such a function is a subset of


$$\mathcal{N} = N \times N \times \cdots \times N \times N^N \times N^N \times \cdots \times N^N$$


(with k factors of N and m factors of $N^N$) and its range is a subset of N. Until
now, we have discussed only the case where the space $\mathcal{N}$ was countable
(i.e., m = 0). Henceforth we will often treat the case k = m = 1 for
notational simplicity, with the understanding that the remarks generalize.
We use $\mathfrak{x}, \mathfrak{y}, \ldots$ as variables over the space $\mathcal{N}$.

We now extend our notion of Turing machines to allow for m oracles. A 
machine can now write a number x on the tape and the oracle will, in one 
step, replace it with $\alpha(x)$. A partial function f on $\mathcal{N}$ is defined to be
recursive if there is a Turing machine that computes f as before, where now 
the function arguments of f are supplied in the form of oracles. 
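
An oracle computation can be pictured in Python by passing the function argument as a callable. The sketch below is only an analogy: the class `Oracle` and the example functional `first_zero_from` are hypothetical names of ours, and the recorded query list is an addition for illustration, not part of the machine model.

```python
class Oracle:
    """Wrap a total function alpha: N -> N and record which arguments are asked."""

    def __init__(self, alpha):
        self._alpha = alpha
        self.queries = []

    def __call__(self, x):
        self.queries.append(x)     # one oracle step: x is replaced by alpha(x)
        return self._alpha(x)


def first_zero_from(x, oracle):
    """A (1, 1)-place partial function: the least n >= x with alpha(n) == 0.

    If alpha has no zero at or beyond x, the loop never ends, so the
    function is undefined at that argument pair.
    """
    n = x
    while oracle(n) != 0:
        n += 1
    return n
```

With the oracle for `alpha(n) = n % 5`, the call `first_zero_from(7, oracle)` consults the oracle at 7, 8, 9 and 10 and then halts with value 10.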

As in Section 4, it is possible to encode the entire history of a single 
calculation into one number y, the computation record. Among other 
things, y encodes all information supplied by the oracle. Of course in any 
one terminating calculation, only a finite amount of the potentially infinite 
wisdom of the oracles can be utilized. Under any reasonable encoding of 
calculations, if y is a computation record then any value $\alpha(i)$ supplied to
the calculation by the oracle will have i < y. Define for $\alpha$ in $N^N$ the
“course-of-values” function $\bar{\alpha}$ by


$$\bar{\alpha}(y) = \langle \alpha(0), \alpha(1), \ldots, \alpha(y - 1) \rangle.$$


Thus $\bar{\alpha}$ is again in $N^N$ and $\bar{\alpha}(y)$ encodes the first y values of $\alpha$. (In particular
$\bar{\alpha}(0) = \langle\ \rangle = 1$, no matter what $\alpha$ is.) For a computation record y, the
number $\bar{\alpha}(y)$ encodes all values of $\alpha$ that were used in the calculation, since
for i < y the value $\alpha(i)$ can be decoded from $\bar{\alpha}(y)$, in fact $\alpha(i) = (\bar{\alpha}(y))_i$.
This phenomenon leads to the following two results.
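
The course-of-values function can be computed directly from the prime-power coding of Section 4. The following is a small sketch; the helper names `nth_prime`, `course_of_values` and `component` are ours.

```python
def nth_prime(i):
    """Return the i-th prime, counting from p_0 = 2."""
    count, n = -1, 1
    while count < i:
        n += 1
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return n


def course_of_values(alpha, y):
    """Return alpha-bar(y) = <alpha(0), ..., alpha(y-1)> as a product of prime powers."""
    code = 1
    for i in range(y):
        code *= nth_prime(i) ** (alpha(i) + 1)
    return code


def component(code, i):
    """Return (code)_i: one less than the exponent of p_i in code."""
    p, exponent = nth_prime(i), 0
    while code % p == 0:
        code //= p
        exponent += 1
    return exponent - 1
```

For the squaring function, `course_of_values(alpha, 3)` is 2^1 · 3^2 · 5^5 = 56250, and each of the three values is recovered by `component`.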

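(i) There is a recursive 4-ary relation T that holds of the natural numbers
e, $\langle x_1, \ldots, x_k \rangle$, $\langle \bar{\alpha}_1(y), \ldots, \bar{\alpha}_m(y) \rangle$, and y iff e encodes a Turing machine
and y is a computation record for that machine, started with
$\ulcorner x_1 \urcorner 0 \ulcorner x_2 \urcorner 0 \cdots 0 \ulcorner x_k \urcorner$ on the tape and supplied with oracles for $\alpha_1, \ldots, \alpha_m$.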


(ii) There is a recursive function U such that whenever T holds of the 
above-mentioned four numbers, then U(y) is the output value of the 
calculation. 

Thus we can extend the normal form results of the preceding section by
defining the (k, m)-place partial function $\{e\}^{k,m}$ where


$$\{e\}^{1,1}(x, \alpha) = U(\mu y \, T(e, \langle x \rangle, \langle \bar{\alpha}(y) \rangle, y)).$$

As usual, we omit the superscripts whenever possible. The Normal Form
Theorem 4.1 then holds, mutatis mutandis. It is interesting to note that
since U and T have only natural numbers as arguments, recursiveness on
$\mathcal{N}$ can be characterized in terms of recursiveness on N.


6. Recursive enumerability 


We have defined a subset of $\mathcal{N}$ to be a recursive (k, m)-ary relation if its
characteristic function was recursive. By Church’s thesis, this is the correct 
formalization of the informal notion of a decidable set. 

Now we want to consider sets that are only half recursive. Call a set R
semi-decidable if there is an effective procedure that, given $\mathfrak{x}$, replies “yes”
iff $\mathfrak{x} \in R$. The procedure is no longer required to be a decision procedure;
now it can be thought of as an accepting procedure. If $\mathfrak{x} \in R$, then the
procedure eventually says “yes”, thereby accepting $\mathfrak{x}$. But if $\mathfrak{x} \notin R$, then in
general the procedure will never terminate. But one never knows in
advance whether the procedure will go on forever or will eventually halt
and accept $\mathfrak{x}$.

When $\mathcal{N}$ is a countable space, we can give another characterization of
the semi-decidable sets. Call R effectively enumerable if there is an effective
procedure that lists, in some order, the members of R. (Of course if R is
infinite then the list will never be completed. But for any particular
member of R, it appears on the listing after some finite length of time.) To
prove that effective enumerability is equivalent to semi-decidability, first
assume that R is effectively enumerable. Then given any $\mathfrak{x}$, we can scan the
listing of R as it appears, and say “yes” if and when we see $\mathfrak{x}$. This shows
that R is semi-decidable. Conversely assume that R is semi-decidable. To
generate a listing of R, we must budget our time sensibly. Order $N^k$ first
according to maximum component and then lexicographically; this orders
$N^k$ in type $\omega$. Then go through all k-tuples in order: $\mathfrak{x}_1, \mathfrak{x}_2, \ldots$. At stage n
of the listing procedure, spend n minutes on each of $\mathfrak{x}_1, \mathfrak{x}_2, \ldots, \mathfrak{x}_n$, testing
them for acceptance into R. If any of these tests results in a “yes”, then put
that k-tuple on the output list. In this way, any member of R is eventually
discovered and placed on the list. (Obviously this argument relies on
having a countable space $N^k$; an uncountable semi-decidable set cannot be
listed in this sense.)
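
The time-budgeting argument can be sketched in Python for the case k = 1. We assume, purely for illustration, that the accepting procedure is presented as a function `accepts_within(x, n)` that reports whether the procedure says “yes” to x within n steps; this interface and the toy acceptor `square_within` are ours, not the text's.

```python
from itertools import count, islice


def enumerate_set(accepts_within):
    """List the members of R by dovetailing.

    At stage n, the first n candidates 0, ..., n-1 are each tested
    for n steps.  The generator never finishes when R is infinite.
    """
    listed = set()
    for n in count(1):                       # stage n
        for x in range(n):                   # the first n candidates
            if x not in listed and accepts_within(x, n):
                listed.add(x)
                yield x


def square_within(x, n):
    """Toy acceptor for the perfect squares: try y = 0, 1, ..., n-1 in turn."""
    return any(y * y == x for y in range(n))


first_five = list(islice(enumerate_set(square_within), 5))
```

No single test is allowed to run unboundedly, so a candidate on which the acceptor never halts cannot block the listing of later members.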

Next we want to give a precise counterpart of the informal concept of a 
semi-decidable set. One possibility would be to go back to Turing 
machines, regarding them not as transducers (with both input and output) 
but as acceptors. But there is a simpler alternative open to us. Any 
semi-decidable set is the domain of the computable partial function taking 
the value 0 on the set and undefined outside the set. Conversely, the 
domain of any computable partial function is semi-decidable; one says 
“yes” if and when the computation terminates. Hence we can formulate 
semi-decidability as follows. 


6.1. DEFINITION. A subset of $\mathcal{N}$ is semi-recursive if it is the domain of some
recursive partial function on $\mathcal{N}$. If $\mathcal{N}$ is countable, then semi-recursive sets
are called recursively enumerable (abbreviated r.e.).


If R is semi-recursive by virtue of being the domain of f, then we can 
think of the Turing machine that computes f as being the accepting device 
for R, where acceptance amounts to halting. The phrase “recursively
enumerable” is sometimes used as a synonym for “semi-recursive” regardless
of the size of $\mathcal{N}$, but we will confine the phrase to countable $\mathcal{N}$.

If we have an accepting device for R and another for its complement 
(with respect to $\mathcal{N}$), then the two devices together decide membership in R.
Hence we have the following result. 


6.2. THEOREM. A relation is recursive iff both it and its complement are
semi-recursive.
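
The idea of running the two accepting devices side by side can be sketched as follows. As in any such toy, we assume the acceptors are given in step-bounded form; the names `decide`, `even_within` and `odd_within` are hypothetical.

```python
from itertools import count


def decide(x, accepts_within, rejects_within):
    """Decide membership of x in R.

    accepts_within(x, n): the acceptor for R says "yes" to x within n steps.
    rejects_within(x, n): the acceptor for the complement does so within n steps.
    Exactly one of them eventually says "yes", so the loop terminates.
    """
    for n in count(1):
        if accepts_within(x, n):
            return True
        if rejects_within(x, n):
            return False


def even_within(x, n):
    """Toy acceptor for the even numbers that needs more than x steps."""
    return x % 2 == 0 and n > x


def odd_within(x, n):
    """Toy acceptor for the odd numbers that needs more than x steps."""
    return x % 2 == 1 and n > x
```

Alternating between the two devices matters: running the acceptor for R to completion first would never return on inputs outside R.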


We can exploit our indexing of recursive partial functions to obtain an
indexing of the semi-recursive sets. Simply define $W_e^{k,m}$ to be the domain of
$\{e\}^{k,m}$. (We omit the superscripts whenever possible.) Then a relation R is
semi-recursive iff it is $W_e$ for some e. Furthermore the (k + 1, m)-ary
relation


$$Q = \{(e, \mathfrak{x}) : \mathfrak{x} \in W_e\}$$


is semi-recursive (being the domain of the function computed by a
universal Turing machine). The semi-recursive relation Q is “universal”
for (k, m)-ary semi-recursive relations in the sense that a relation R is
semi-recursive iff it is obtainable from Q by holding e fixed as a parameter.


6.3. THEOREM. The following conditions on a (k, m)-ary relation R are
equivalent.

(i) R is semi-recursive.

(ii) For some recursive (k + 1, m)-ary relation Q,


$$R = \{\mathfrak{x} : \exists w \, Q(w, \mathfrak{x})\}.$$

(iii) For some recursive (k + n, m)-ary relation P,

$$R = \{\mathfrak{x} : \exists w_1 \cdots \exists w_n \, P(w_1, \ldots, w_n, \mathfrak{x})\}.$$

PROOF. We have (i) $\Rightarrow$ (ii) because $\mathfrak{x} \in W_e \Leftrightarrow \exists y \, T(e, \langle \mathfrak{x} \rangle, y)$. Trivially
(ii) $\Rightarrow$ (iii). To prove (iii) $\Rightarrow$ (i) we use sequence encoding: R = dom f
where $f(\mathfrak{x}) = \mu w \, P((w)_1, (w)_2, \ldots, (w)_n, \mathfrak{x})$. For recursive P, the partial
function f is also recursive. □


In Section 4 we showed that the set

$$K = \{x : \{x\}(x) < \infty\}$$

was not recursive. But K is a recursively enumerable subset of N, since

$$x \in K \Leftrightarrow \exists y \, T(x, \langle x \rangle, y)$$

for a recursive relation T. So by Theorem 6.2 we may conclude that the
complement $\bar{K}$ is not r.e.

Although K is undecidable, there is a sense in which questions about
membership in any r.e. subset of N are reducible to questions about K.
Consider any r.e. subset $W_e$ of N and define for each x the function


$$f(t) = \{e\}(x).$$


Then f(t) is independent of t, and in fact f is the empty function if $x \notin W_e$.
But f is total if $x \in W_e$. Now f is a recursive partial function, but more to
the point is that we can recursively find an index $\pi(e, x)$ for f. On an
informal level, this is clear: the equation displayed above tells how to
calculate f, and we can arrive at this equation effectively when given e and
x. Formally, we use the parameter theorem below.

If g is a two-place recursive partial function, then g(8, y) is, as a function 
of y, recursive. A more significant fact is that we can recursively find an 
index for this function from an index for g. This fact, stated more generally,
is the following theorem. 


6.4. PARAMETER THEOREM. For each k and m there is a one-to-one total
recursive function p such that


$$\{e\}(x_1, \ldots, x_n, y_1, \ldots, y_k, \alpha_1, \ldots, \alpha_m) = \{p(e, \langle x_1, \ldots, x_n \rangle)\}(y_1, \ldots, y_k, \alpha_1, \ldots, \alpha_m)$$

always holds.


Here $x_1, \ldots, x_n$ are parameters being held fixed. The idea is to have
p(e, v) encode instructions for writing v to the left of the other input on the
tape, and then following the instructions encoded by e. (The parameter theorem
is also known as the “S-m-n theorem”, for historical reasons.)
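
The flavor of the parameter theorem can be conveyed by a toy model in which an “index” is Python source text defining a function `main`, and `run` plays the role of the universal machine. This model is our assumption for illustration only; in particular the sketch makes no attempt to show that `p` is one-to-one. The point is that `p` builds the new program text without ever running the old one.

```python
def run(source, *args):
    """Toy universal machine: execute program text and apply its `main`."""
    namespace = {}
    exec(source, namespace)
    return namespace["main"](*args)


def p(source, *fixed):
    """Return program text that behaves like `source` with leading arguments fixed.

    The parameters are written into the new program as literals, much as
    p(e, v) writes v on the tape before following the instructions of e.
    """
    return (
        f"_source = {source!r}\n"
        f"_fixed = {fixed!r}\n"
        "def main(*rest):\n"
        "    namespace = {}\n"
        "    exec(_source, namespace)\n"
        "    return namespace['main'](*_fixed, *rest)\n"
    )


ADD = "def main(x, y):\n    return x + 10 * y\n"
```

Here `run(ADD, 3, 4)` and `run(p(ADD, 3), 4)` both give 43, and `p` is total even on programs that never halt, since it only manipulates text.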

We can now apply the parameter theorem to the function 


$$g(e, x, t) = \{e\}(x)$$

to get a one-to-one total recursive $\pi$ such that $g(e, x, t) = \{\pi(e, x)\}(t)$ and
hence

$$x \in W_e \Rightarrow \{\pi(e, x)\} \text{ is total},$$
$$x \notin W_e \Rightarrow \{\pi(e, x)\} \text{ is empty},$$
$$x \in W_e \Leftrightarrow \pi(e, x) \in K.$$


This reduces questions about $W_e$ to questions about K. There are other r.e.
sets besides K for which such reductions exist. The most obvious example
is $\{(x, y) : x \in W_y\}$.

For subsets A and B of N, define A to be many-one reducible to B
($A \le_m B$) if for some total recursive f,


$$x \in A \Leftrightarrow f(x) \in B.$$


Call A one-one reducible to B ($A \le_1 B$) if in addition f can be required to
be one-to-one. Then any r.e. subset of N is one-one reducible to K. It is
clear (at least on the informal level) that if either $A \le_m B$ or $A \le_1 B$ and B
is recursive, then A is also recursive. The same is true with “recursive”
replaced by “recursively enumerable”.


The parameter theorem is a standard tool in formalizing reductions of 
one decision problem to another. Such a reduction may prove that a 
decision problem is unsolvable, as in the following result. 


6.5. THEOREM (RICE [1953]). Let $\mathcal{C}$ be a set of one-place recursive partial
functions. Then the set $\{e : \{e\} \in \mathcal{C}\}$ of indices of members of $\mathcal{C}$ is recursive iff
either $\mathcal{C}$ is empty or $\mathcal{C}$ contains all one-place recursive partial functions.


PROOF. The “$\Leftarrow$” half is trivial. So assume that the set of indices

$$I = \{e : \{e\} \in \mathcal{C}\}$$

is recursive. Since both the hypothesis and the conclusion of the theorem
are symmetric with respect to $\mathcal{C}$ and its complement, we may suppose that
the empty function $\emptyset$ is not in $\mathcal{C}$. We will show that $\mathcal{C}$ is empty by showing
that we could otherwise reduce membership questions about K to the
recursive set I.

So assume that, contrary to our hopes, some function $\psi$ is in $\mathcal{C}$. The idea
is to end up with $x \in K \Leftrightarrow g(x) \in I$ by arranging to have $\{g(x)\}$ be $\psi$ or $\emptyset$
as x either is or is not in K. Informally, g(x) encodes instructions for: given
y, compute first {x}(x) and then $\psi(y)$. Formally, $g(x) = p(e, \langle x \rangle)$ where


$$\{e\}(x, y) = U\bigl((\mu z[\,T(x, \langle x \rangle, (z)_0) \mathrel{\&} T(q, \langle y \rangle, (z)_1)\,])_1\bigr)$$


and q is an index for $\psi$. This gives us $K \le_1 I$, contradicting the fact that I is
recursive and K is not. □


As immediate consequences of Rice’s theorem, we have the following 
negative statements. The set of indices of total recursive functions is not 
recursive. For any fixed recursive partial function f on N, the set of indices 
of f is not recursive (and hence is infinite). The set $\{e : W_e \text{ is finite}\}$ is not
recursive. And so forth. 

A more subtle consequence of the parameter theorem is the recursion 
theorem, due to KLEENE [1938]. 


6.6. RECURSION THEOREM. (i) For any total recursive $f : N \to N$ we can find a
number e for which {e} = {f(e)}.
(ii) For any recursive partial function g we can find a number e such that


$$\{e\}(\mathfrak{x}) = g(e, \mathfrak{x}) \quad \text{for all } \mathfrak{x}.$$


The proof is very like the proof that gives us self-referential sentences in
number theory (e.g. Theorem 2.2.1 in Chapter D.1).


PROOF. Parts (i) and (ii) are equivalent. To prove (i), we obtain from the
parameter theorem a total recursive $\gamma$ such that $\{\gamma(x, y)\} = \{\{x\}(y)\}$ for any
x and y. Let r be an index for the function whose value at x is $f(\gamma(x, x))$
and let $e = \gamma(r, r)$. This number e works:


$$\{e\} = \{\gamma(r, r)\} = \{\{r\}(r)\} = \{f(\gamma(r, r))\} = \{f(e)\}.$$


To prove part (ii), we first get from the parameter theorem a total
recursive f such that $\{f(t)\}(\mathfrak{x}) = g(t, \mathfrak{x})$. Then by part (i) there is a number e
such that $\{e\}(\mathfrak{x}) = \{f(e)\}(\mathfrak{x}) = g(e, \mathfrak{x})$. □
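
Part (ii) says that a program may be written as if it had access to its own index. In a toy model where an index is Python source text defining `main` (our assumption, as is every name below), the self-reference can be obtained by the familiar quine trick, which parallels the use of $\gamma(r, r)$ in the proof without being a transcription of it.

```python
def run(source, x):
    """Toy universal machine: execute program text and apply its `main` to x."""
    namespace = {}
    exec(source, namespace)
    return namespace["main"](x)


def fixed_point(g_source):
    """Given text defining g(e, x), return program text e with run(e, x) == g(e, x)."""
    template = (
        "template = {t!r}\n"
        "g_source = {g!r}\n"
        "def main(x):\n"
        "    me = template.format(t=template, g=g_source)\n"   # rebuild own text
        "    namespace = {{}}\n"
        "    exec(g_source, namespace)\n"
        "    return namespace['g'](me, x)\n"
    )
    return template.format(t=template, g=g_source)


G_LENGTH = "def g(e, x):\n    return len(e) + x\n"   # uses its own index e
G_SELF = "def g(e, x):\n    return e\n"              # outputs its own index
```

The program `fixed_point(G_SELF)` prints its own text on every input, and `fixed_point(G_LENGTH)` adds the length of its own text to its argument.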


We can use the recursion theorem to give a short proof of Rice’s
theorem. Suppose that $\{a\} \in \mathcal{C}$ and $\{b\} \notin \mathcal{C}$, and define f(x) to be b or a as
x is or is not in I. There can be no e such that {e} = {f(e)}, and hence f
cannot be recursive. So I is not recursive.


7. Logic and recursion theory 


Why is recursive function theory part of mathematical logic? If logicians 
had not invented recursive functions, computer scientists would have 
developed the subject later. But it was not a mere historical accident that 
recursive functions were invented by logicians. There are certain aspects of
logic that inevitably involve the notions of constructiveness and effective- 
ness. 

A basic concept of logic is that of a proof. Now a proof, viewed 
abstractly, is a series of statements that “establishes” without doubt the
truth of its conclusion, given the truth of its assumptions. But to establish 
convincingly the truth of the conclusion, the proof must be verifiable by 
others. There must be some procedure by which an outsider can verify the 
correctness of the proof, without having to supply brilliant insight. That is, 
it must be possible to verify the correctness of proofs by an effective 
procedure. The set of proofs must be recursive. It would not do, for 
example, to consider just any series of true sentences of arithmetic to be a 
proof of its last line. We cannot, given a sentence of arithmetic, tell 
effectively whether or not it is true, because the set of true sentences of 
arithmetic is not recursive nor even r.e. (Section 10). 

Now consider the set of all theorems, i.e., the provable sentences. A
sentence $\sigma$ is provable iff


$$\exists d \, [d \text{ is a proof of } \sigma].$$

The part in square brackets must be recursive. And so the set of theorems
must be recursively enumerable. Thus as long as we can effectively
recognize correct proofs, the set of theorems will be recursively
enumerable! The Gödel incompleteness theorem discussed in Chapter D.1
stems from the fact that provability is r.e. whereas truth is not.

To be more specific, consider a first-order language L, such as the
language for set theory, having finitely many non-logical symbols. (Actually
there could be $\aleph_0$ non-logical symbols as long as they are arranged
tidily.) We first assign numbers (called Gödel numbers) to the expressions
of the language in a straightforward way. This permits us to apply notions
of recursion theory to the expressions. (Alternatively we could have Turing
machines work directly on the symbols of the language.) In fact we will not
bother to distinguish between an expression and its Gödel number. One
can verify that the set of formulas is recursive, as is the set of sentences.
Now add a set A of axioms, such as the Zermelo-Fraenkel (ZF) axioms of
set theory. We naturally expect A to be recursive, so that in verifying the
correctness of a proof we will be able effectively to tell the axioms from the
non-axioms. (For example, the set of ZF axioms is recursive.) For a
recursive set A, the binary relation


$$\{(\sigma, d) : d \text{ is a proof of } \sigma \text{ from } A\}$$


is recursive, where “proof” is defined as in Section 4 of Chapter A.1. (In
fact we could use either formal system from that chapter.) This is not a
deep result; we intuitively expect proofhood to be decidable, so when the
concepts involved are made precise it should not be surprising to find that it
is indeed decidable. Call a theory recursively axiomatizable if it is given by
a recursive set A of axioms in a language L as above.


7.1. THEOREM. A recursively axiomatizable theory has a recursively
enumerable set of theorems.


PROOF. $\tau$ is a theorem iff

$$\exists d \, [d \text{ is a proof of } \tau \text{ from } A]$$

where A is the recursive set of axioms. The part in brackets is
recursive. □


Take again the example of ZF set theory. By Theorem 7.1, the set of
theorems of ZF is r.e. It follows that the sentences of arithmetic provable in
ZF (this can be made precise) form a r.e. set. So they cannot coincide with
the true sentences of arithmetic. Either too much is provable (and ZF is
lying to us) or too little is provable.

Now let us go one step further and suppose that we have a recursively
axiomatizable theory that is complete (i.e., for any sentence $\sigma$, either $\sigma$ or
$(\neg\sigma)$ is a theorem). Then we can strengthen Theorem 7.1; the theory is
actually decidable.


7.2. THEOREM. A complete recursively axiomatizable theory has a recursive 
set of theorems. 


PROOF. The conclusion certainly holds if the theory is inconsistent, so
assume the theory is consistent. Suppose we are given a sentence $\sigma$ and we
want to decide whether it is a theorem. We generate a listing of all the
theorems; by Theorem 7.1 this is possible. Eventually either $\sigma$ or $(\neg\sigma)$
appears in the listing. When this happens, we can stop and give the correct
answer. □
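
The decision procedure in this proof is short enough to sketch in Python. Everything below is a toy under stated assumptions: sentences are the strings `p0`, `p1`, ... and their negations `~p0`, `~p1`, ..., and the hypothetical generator `theorems` lists a complete consistent theory in which exactly the even-numbered atoms hold.

```python
from itertools import count


def theorems():
    """Toy enumeration of the theorems: p_i for even i, ~p_i for odd i."""
    for i in count():
        yield f"p{i}" if i % 2 == 0 else f"~p{i}"


def negate(sentence):
    """Negation in the toy language (a negated atom is mapped back to the atom)."""
    return sentence[1:] if sentence.startswith("~") else "~" + sentence


def is_theorem(sentence, enumerate_theorems):
    """Scan the listing until the sentence or its negation turns up."""
    for t in enumerate_theorems():
        if t == sentence:
            return True
        if t == negate(sentence):
            return False
```

Completeness is what guarantees termination: on a string that is not a sentence of the toy language, neither it nor its negation is ever listed and the scan runs forever.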


Theorem 7.2 is the basis for a number of decidability results; see Chapter 
C.3. Of course it suffers from the limitation of being applicable only to 
theories that are complete. 

Properly viewed, proofs and calculations are objects of the same sort. A 
calculation (written down with all the steps) is a proof that the value of a 
function of a given argument is a certain number. And a proof is a 
calculation of one value of the function whose domain is the set of 
theorems. It is a calculation in the sense of being a finite and verifiable 
record that correct procedures have been followed. 

For example, it turns out that a set R of numbers is recursive iff it is 
binumerable in first-order Peano arithmetic (cf. Section 3). Here the role of 
Turing machine computations is played by the formal deductions, modus 
ponens and all, establishing that a given number is indeed in the set. 

Turing machines themselves have proved to be convenient tools in a 
variety of undecidability problems in logic. Take for example the result of
KAHR, MOORE and WANG [1962] that for any formula $\varphi$ we can effectively
find an $\forall\exists\forall$ formula that is satisfiable iff $\varphi$ is satisfiable. This is not proved
by syntactical manipulations on $\varphi$, but instead by finding for each Turing
machine M an $\forall\exists\forall$ formula that is satisfiable iff M never halts. This,
together with results of the previous section, yields the existence of the 
desired reduction. 

Or suppose we want to prove that the set of sentences having models of 
every non-zero cardinality is undecidable. We can do this by showing how,
given a Turing machine M, to find effectively a sentence that has models of
every non-zero cardinality iff M never halts.


8. Degrees of unsolvability 


All of the non-recursive sets have unsolvable decision problems. But 
some are more unsolvable than others. In this section we will see how some 
amount of order can be imposed on the unsolvable problems. 

Consider a partial function f on $\mathcal{N}$, and let $\mathcal{B}$ be a subset of $N^N$. It may be
possible to compute f if we are given oracles for each function in $\mathcal{B}$. Define
f to be recursive in $\mathcal{B}$ if there exists some recursive partial function g and
some $\beta_1, \ldots, \beta_n$ in $\mathcal{B}$ such that


$$f(\mathfrak{x}) = g(\mathfrak{x}, \beta_1, \ldots, \beta_n)$$


for all $\mathfrak{x}$. Other definitions can then be “relativized to $\mathcal{B}$”. For example a
subset of $\mathcal{N}$ is recursive in $\mathcal{B}$ if its characteristic function is recursive in $\mathcal{B}$,
and it is semi-recursive in $\mathcal{B}$ if it is the domain of some partial function
recursive in $\mathcal{B}$. Usually $\mathcal{B}$ will be a singleton $\{\beta\}$, so that we speak of
recursiveness in $\beta$ and so forth.

The extreme case of $\mathcal{B} = N^N$ deserves special mention. When we give
ourselves oracles for all functions in $N^N$, matters of recursiveness are
washed out. On a countable space $\mathcal{N}$, every function is recursive in $N^N$. But
for uncountable $\mathcal{N}$ this does not happen. If f is recursive in $N^N$ then in
calculating $f(\alpha)$ there is still one restriction: We can use only a finite
amount of information about $\alpha$. That is, there must be some y (depending
on $\alpha$) such that $f(\alpha) = f(\gamma)$ for any $\gamma$ agreeing with $\alpha$ at the first y values.
This is exactly the condition for f to be continuous, when we put the
discrete topology on N and the product topology on $N^N$. The space $\mathcal{N}$ is
then given the product topology.


8.1. THEOREM. (a) A function on $\mathcal{N}$ is recursive in $N^N$ iff it is continuous.
(b) A subset of $\mathcal{N}$ is semi-recursive in $N^N$ iff it is open.


All the uncountable spaces $\mathcal{N}$ are homeomorphic to $N^N$. For example a
homeomorphism from $N^N \times N^N$ onto $N^N$ can map $(\alpha, \beta)$ to the function
taking $2x \mapsto \alpha(x)$ and $2x + 1 \mapsto \beta(x)$. And $N^N$ is homeomorphic to the
irrationals; the mapping here uses continued fractions. Thus it is possible to
give a uniform treatment of topological set theory of the irrationals on the
one hand and recursion theory of $\mathcal{N}$ on the other. Both sides gain from this
connection. For further discussion in this vein, see Chapter C.8 on
descriptive set theory.
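
The pairing map on function space is easy to write down explicitly. In this sketch the names `join` and `split` are ours; functions in $N^N$ are represented by Python callables on the non-negative integers.

```python
def join(alpha, beta):
    """Map (alpha, beta) to the function sending 2x to alpha(x) and 2x+1 to beta(x)."""
    return lambda n: alpha(n // 2) if n % 2 == 0 else beta(n // 2)


def split(gamma):
    """Inverse map: recover the pair of functions from the joined function."""
    return (lambda x: gamma(2 * x)), (lambda x: gamma(2 * x + 1))
```

Each value of the joined function depends on a single value of one argument, and each value of a recovered component depends on a single value of the joined function, which is the finite-information property that continuity in both directions requires.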

But we have strayed from our main topic. We will be concerned with a 
special case of relative recursiveness. Let $\alpha$ and $\beta$ be members of $N^N$. Then,
by our previous definition, $\alpha$ is recursive in $\beta$ iff there is a recursive partial
function g for which


$$\alpha(x) = g(x, \beta).$$


(Although both $\alpha$ and $\beta$ are total, we cannot demand that g be total.) If $\alpha$
is recursive in $\beta$, then we write $\alpha \le_T \beta$ and say that $\alpha$ is Turing reducible to
$\beta$. We can also (and in fact equivalently) work with subsets of N: say that A
is recursive in B (written $A \le_T B$) if the characteristic function of A is
recursive in the characteristic function of B.


8.2. THEOREM. The binary relation $\le_T$ (either on $N^N$ or on $\mathcal{P}(N)$) is reflexive
and transitive.


On an informal level, transitivity of $\le_T$ corresponds to connecting
machines in series. 

As a consequence of the above theorem, the symmetric relation of 
Turing equivalence 


α ≡_T β iff α ≤_T β and β ≤_T α


is an equivalence relation on N^N. Let [α] be the equivalence class of α. The
equivalence classes are called degrees of unsolvability. The degrees are
partially ordered by the relation


[α] ≤ [β] iff α ≤_T β.


(Clearly this is well defined on equivalence classes.) We get the same
degree structure on 𝒫(N) as on N^N, since for any α we can find a set B with
α Turing equivalent to the characteristic function of B.

Thus the degrees of unsolvability are partially ordered according to just 
how unsolvable they are. There is obviously a least degree 0, consisting of 
the recursive functions. Because for any fixed function β the set
{α: α ≤_T β} is countable, it follows that each degree is a countable set of
functions. Consequently there are 2^{ℵ_0} degrees. Another consequence is that
any chain of degrees (any linearly ordered subset) has cardinality at
most ℵ_1. This strongly suggests that incomparable degrees exist. This
suspicion can be proved to be correct (without having to deny the
continuum hypothesis). We can simultaneously construct α and β so as to
sabotage each machine that might reduce one function to the other.

In fact much more is true. There is an antichain (a set of degrees no
two of which are comparable) of cardinality 2^{ℵ_0}. This result is just a piece
of the extensive information known about the degrees. See Chapter C.4 for 
much more on this topic. 

Call a degree recursively enumerable (r.e.) if it is the degree of some r.e. 
set. Recall that the set 


K = {x: {x}(x)↓}


is a r.e. subset of N that is not recursive. Hence the degree of K, denoted 0′,
is a r.e. degree greater than 0. In Section 9 we will consider the question 
whether there are other r.e. degrees (Post’s problem). 

The passage from ∅ to K can be relativized to give us a function A ↦ A′
on 𝒫(N). Define the jump of A (denoted A′) to be the set


{x: {x}^A(x)↓}

where {x}^A is the partial function recursive in A with index x, i.e.,


{x}^A(y) = {x}(y, χ_A)


where χ_A is the characteristic function of A. Then A′ is r.e. in A, but is not
recursive in A; the proof is the same as for K. In fact by relativizing the 
proof of the corresponding result for K we have the following. 


8.3. THEOREM. (i) A′ is r.e. in A but is not recursive in A.
(ii) A set B is r.e. in A iff B ≤_1 A′.


Thus among the sets that are r.e. in A, its jump A′ ranks highest with
respect to one-one reducibility. It follows from the foregoing theorem that 
the jump operation is well defined on degrees. 


8.4. THEOREM. A ≤_T B iff A′ ≤_1 B′.


This lets us define for each degree a its jump a′. By Theorem 8.3 we have
a < a′. And so we can continue: a < a′ < a″ < ⋯. In particular there is no
largest degree. And above any degree we can find a chain of order type w. 
In fact we can find one of the order type of the first uncountable ordinal; at 
limit ordinal steps we gather together all the previous sets in some 
systematic way.


9. Creative and lesser sets


As observed in Section 7, any recursively axiomatizable theory has a r.e. 
set of theorems. Thus r.e. sets are of particular significance for logic. For 
example one could hope that by classifying r.e. sets one would obtain an 
interesting classification of axiomatizable theories. The binary classification 
of r.e. sets into the recursive and non-recursive sets partitions theories into 
decidable and undecidable theories. But one could hope for a refinement 
of this binary split. 

The first classification to examine comes from the degrees of unsolvabil- 
ity. And next one could examine the refinement obtained from other 
reducibilities. Clearly 


A ≤_1 B ⇒ A ≤_m B ⇒ A ≤_T B,

so when we define

A ≡_1 B iff A ≤_1 B & B ≤_1 A,

A ≡_m B iff A ≤_m B & B ≤_m A


the equivalence relation ≡_T is refined by ≡_m, and further refined by ≡_1.
Section 6 shows that there is a largest r.e. degree (the degree of K) with 
respect to each of these reducibilities. 

But what about the other r.e. degrees? Each such degree contains 
axiomatizable theories: 


9.1. THEOREM (FEFERMAN [1957]). Every r.e. degree of unsolvability con- 
tains a recursively axiomatizable theory. 


PROOF. For any set A of numbers we can form a theory in the language of
equality by taking as axioms the set


{¬ε_n: n ∈ A}


where ε_n is the sentence “there are exactly n things in the universe”. Then
the set of theorems is Turing equivalent to A. If A is r.e. then we certainly
have a r.e. set of axioms. But any theory whose axioms can be recursively
enumerated {σ_0, σ_1, σ_2, ...} has a recursive set of axioms
{σ_0, σ_0 ∧ σ_1, σ_0 ∧ σ_1 ∧ σ_2, ...}. □
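
The last step of the proof (replacing an enumerated axiom set by a decidable one) can be illustrated by a small decision procedure. This is a sketch under stated assumptions: sentences are plain strings, the enumeration sigma is a hypothetical stand-in for an arbitrary recursive enumeration of axioms, and conjunctions are written with explicit parentheses so that their lengths strictly increase.

```python
from typing import Callable


def is_axiom(candidate: str, sigma: Callable[[int], str]) -> bool:
    """Decide membership in {s0, (s0 & s1), ((s0 & s1) & s2), ...}.

    The conjunctions get strictly longer, so the search can stop as soon
    as the current conjunction is longer than the candidate.
    """
    conj = sigma(0)
    n = 0
    while len(conj) <= len(candidate):
        if conj == candidate:
            return True
        n += 1
        conj = "(" + conj + " & " + sigma(n) + ")"
    return False


# Hypothetical enumeration of axioms, for illustration only.
sigma = lambda n: f"~E{n * n}"

in_set = is_axiom("((~E0 & ~E1) & ~E4)", sigma)
not_in_set = is_axiom("~E1", sigma)
```

The sentence "~E1" is a theorem of the resulting theory but not one of its axioms; the point is only that the new axiom set is decidable while axiomatizing the same theory.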


HANF [1965] has shown that this theorem can be strengthened by
requiring that the theory be finitely axiomatizable. But the effect of the
theorem is offset by the empirical observation that all “natural” r.e.
theories are either of degree 0 or of degree 0′. Proofs that a theory T is
undecidable generally show, in effect, that the halting problem for Turing
machines is reducible to T. And the reducibility here is (or can become) ≤_1,
so that the proof shows that K ≤_1 T. If T is recursively axiomatizable, then
we have T ≡_1 K.

For the cases of ≤_1 and ≤_m, we can give an intrinsic characterization of
the sets in the highest r.e. degree. Theorem 6.2 states that a r.e. set A is
non-recursive iff each r.e. subset of its complement Ā fails to fill Ā. By
uniformizing this last condition, we obtain the following definition, due to 
Post [1944]. 


9.2. DEFINITION. A set A of numbers is creative if A is r.e. and there
exists a recursive partial function f such that whenever W_x ⊆ Ā then
f(x) ∈ Ā − W_x.


The function f in this definition is called a productive function for A. For 
example, our set K is creative; we can take f(x) = x. 


9.3. THEOREM (MYHILL [1955]). The following conditions on a set of num- 
bers are equivalent. 
(i) A is creative. 
(ii) A is a r.e. set to which all r.e. sets are ≤_1-reducible.
(iii) A is a r.e. set to which all r.e. sets are ≤_m-reducible.


Trivially (ii) ⇒ (iii) and it is not hard to show that (iii) ⇒ (i). The main
part of the proof is establishing (i) ⇒ (ii). This breaks down into two steps,
first showing that a creative set has a one-to-one total productive function, 
and secondly using a version of the recursion theorem. 

The above theorem implies that the usual undecidable axiomatizable 
theories such as first-order Peano arithmetic have creative sets of theorems. 

There is still the question of what (if anything) lies between the recursive 
sets and the creative sets. For ≤_m and ≤_1 reducibilities, the existence of
intermediate sets was established by Post [1944]. He noted that a creative 
set must have a very rich complement, and proceeded to construct sets with 
sparse complements. Call a r.e. set S simple if S̄ is infinite but includes no
infinite r.e. subset. 


9.4. THEOREM (Post [1944]). Simple sets exist. 


PROOF. Imagine a fixed method of enumerating all r.e. sets. For each n, put
into S the first discovered (if any) member of W_n that is greater than 2n.
Then S is r.e., S intersects every infinite r.e. set, but S contains at most n
of the first 2n + 1 numbers. □
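
The construction in this proof can be tried out on a toy scale. The sketch below is not the real construction, which dovetails over all r.e. sets; it assumes a finite, hand-picked family of enumerators standing in for W_0, W_1, ..., and the function name post_simple_set is introduced here for illustration.

```python
from itertools import count
from typing import Callable, Iterator, List, Set


def post_simple_set(enumerators: List[Callable[[], Iterator[int]]],
                    stages: int) -> Set[int]:
    """Dovetail the enumerations; for each n keep the first discovered
    member of W_n that exceeds 2n."""
    running = [make() for make in enumerators]
    satisfied = [False] * len(running)
    S: Set[int] = set()
    for stage in range(stages):
        for n, gen in enumerate(running):
            if n > stage or satisfied[n]:
                continue                 # W_n starts at stage n; skip if done
            x = next(gen, None)          # one more step of the enumeration of W_n
            if x is not None and x > 2 * n:
                S.add(x)
                satisfied[n] = True
    return S


# Toy stand-ins for W_0, ..., W_4 (arbitrary choices; W_3 is finite).
family = [
    lambda: count(0, 2),                  # even numbers
    lambda: count(0, 3),                  # multiples of 3
    lambda: (x * x for x in count()),     # squares
    lambda: iter([1, 2, 3]),              # a finite set
    lambda: (2 ** x for x in count()),    # powers of 2
]
S = post_simple_set(family, stages=20)    # {2, 3, 9, 16}
```

Because the element contributed on behalf of W_n exceeds 2n, only W_0, ..., W_{n-1} can contribute numbers up to 2n, which is the counting argument behind the bound "at most n of the first 2n + 1 numbers".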


A simple set is obviously non-recursive. And it is easy to generate an 
infinite r.e. subset of the complement of any creative set, so a simple set 
cannot be creative. Thus the above theorem establishes the existence of r.e. 
sets of intermediate ≡_m and ≡_1 degree. “Post’s problem” is the question
whether there are r.e. sets of Turing degree intermediate between 0 and 0′.
This question is not answered by the simple sets, which can be of degree 0′.
Post had hoped that by placing even more stringent requirements on S, he 
could force S to be of intermediate degree of unsolvability. This approach 
turned out to be unsuccessful. Finally in 1956 (two years after Post’s death) 
the problem was solved simultaneously and independently by Friedberg (in 
his senior thesis at Harvard and in FRIEDBERG [1957]) and MUČNIK [1956] in
the Soviet Union. They showed that intermediate r.e. degrees do exist, and
in great profusion. Their method of proof has been termed the “priority
method”. In broad terms, this method involves a construction in which
there are infinitely many requirements to be satisfied. But some of these 
requirements conflict with one another, so at various stages of the 
construction one must satisfy requirements of high priority, allowing those 
of lower priority to be injured. If all goes well, at the end each requirement 
has received enough attention for the construction to succeed. 

Since 1956 the r.e. degrees have been studied extensively. We will quote 
here two theorems, both due to SACKS [1964, 1963]. The r.e. degrees are
dense, in that if a < c then a < b < c for a third r.e. degree b. And any
countable partial ordering can be embedded in the partial ordering of r.e.
degrees. (For this it suffices to embed some countable atomless Boolean 
algebra — as a partial ordering — in the r.e. degrees.) 


10. Definability and recursion 


Mathematics is, as it has always been, largely the science of measurement.
But “measurement” must here be understood as referring to more
than the meter stick. The genus of a topological figure measures one of its 
aspects; objects of genus zero are in a sense simpler than those of higher 
genus. There are many dimensions of measurement in mathematics, and 
they go by many names: characteristic, transcendence degree, cardinality, 
fundamental group, etc. Occasionally we are so successful in the science of 
measurement that we can completely characterize an object (at least to
within some concept of isomorphism) by giving, as it were, its latitude and
longitude, i.e., its measurements in the relevant dimensions. 

Usually the measurement values in a particular dimension can be 
ordered or at least partially ordered. They then induce a ranking on the 
objects being measured, according to whether the measurement of the 
object yields a value that is small or large. 

Now consider the case of measurement of sets of natural numbers. This 
case is not as specialized as it might seem, since it is applicable to anything 
that can be encoded into sets of numbers, such as formal languages (sets of 
strings of symbols). First of all we have a binary measurement: a set of 
numbers can be recursive or non-recursive. For example the set of primes 
is recursive, and the set of theorems of set theory or group theory, suitably
encoded, is not recursive. There are countably many recursive sets, so
almost all sets (in the usual measure on 𝒫(N)) are non-recursive.

The degrees of unsolvability provide one scale of measurement, indicat- 
ing how far a set is from being recursive. But the degrees themselves form 
an untidy array, with only a few reliable bench marks to help us get our 
bearings. 

The scale of measurement to be treated in this section is definability. It is
applicable, in the present context, to sets that are definable in the structure
𝔑 = (N, 0, S, +, ·) by formulas of first or second order. For each natural
number n, let n̄ be the corresponding numeral in the formal language of 𝔑.
A formula φ(x) with just x free is said to define A in 𝔑 if for every n,


n ∈ A ⟺ ⊨_𝔑 φ(n̄).


(This notation is from Chapter A.1.) Thus A consists of exactly those
numbers making φ true in 𝔑. In the case of a k-ary relation, we use a
formula φ(x_1, ..., x_k) with several free variables.

Obviously only countably many sets are definable. We can produce an
artificial example of a non-definable set by diagonalization. Nevertheless,
among the ℵ_0 sets that are definable in 𝔑 lie most of the interesting sets of
numbers. After all, if we are interested in A, then we probably know what
A is, and that knowledge can probably be formalized to yield a definition
of A in 𝔑.

If a set is definable in 𝔑, then we want to have a measurement indicating
how definable it is. For a start, call a set arithmetical if it is definable in 𝔑
by a first-order formula, and call it analytical if it is definable by a 
second-order formula. Then the arithmetical sets constitute a subclass of 
the analytical sets, and it is a subclass of sets that are more easily definable 
than are the sets in its complement.

Within the class of arithmetical (or analytical) sets, we can ask for finer
measurements as to just how definable a given set is. For example, we 
could use as a measurement the length of the shortest defining formula, or 
the number of quantifiers. But to obtain intrinsically significant measure- 
ments, some care is needed. 

For example, exponentiation (as a ternary relation) is definable in 𝔑. But
it is not easy to find a defining formula, and any defining formula will be
fairly long. Yet our decision to include addition and multiplication in 𝔑 and
to exclude exponentiation was rather arbitrary. We want measurements
that are free from such arbitrary choices. For this reason we decide to
measure “definability modulo recursiveness”. We have available the
following theorem, which is essentially due to GÖDEL [1931].


10.1. THEOREM. Every recursive relation on N is arithmetical. 


The proof uses the techniques of Section 4 to translate statements about 
Turing machines into statements about numbers. It is necessary to have an 
arithmetical way of encoding a sequence of numbers into a single number. 
The Chinese remainder theorem provides such a method. 
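
One standard way to carry out this encoding is Gödel's β-function, sketched below. The text does not spell out the coding it has in mind, so the particular choice of moduli 1 + (i + 1)d with d a factorial is an assumption, as are the names beta and encode.

```python
from math import factorial
from typing import List, Tuple


def beta(c: int, d: int, i: int) -> int:
    """Goedel's beta function: the i-th entry coded by the pair (c, d)."""
    return c % (1 + (i + 1) * d)


def encode(seq: List[int]) -> Tuple[int, int]:
    """Find (c, d) with beta(c, d, i) == seq[i] for all i.

    With d = s! and s >= len(seq), s >= max(seq), the moduli
    1 + (i + 1) * d are pairwise coprime and exceed every entry,
    so the Chinese remainder theorem yields a suitable c.
    """
    s = max([len(seq)] + seq)
    d = factorial(s)
    c, modulus = 0, 1
    for i, a in enumerate(seq):
        m = 1 + (i + 1) * d
        # Lift c so that it is also congruent to a modulo m.
        t = ((a - c) * pow(modulus, -1, m)) % m
        c += modulus * t
        modulus *= m
    return c, d


c, d = encode([3, 1, 4, 1, 5])
decoded = [beta(c, d, i) for i in range(5)]
```

The point of this coding is that beta uses only addition, multiplication and remainder, so "there is a sequence such that ..." can be expressed by quantifying over the two numbers c and d. The modular inverse via pow(x, -1, m) needs Python 3.8 or later.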

From this theorem we can further conclude that r.e. relations on N are 
also arithmetical, since the defining formula for {x: ∃y R(x, y)} requires
only one quantifier more than the defining formula for R. And comple- 
ments of r.e. relations are arithmetical, and so forth. 

Before making that “and so forth” more systematic, we should note that
we can now prove the undecidability of number theory. Let T be the set of
sentences true in 𝔑, suitably encoded into numbers. The set K is
arithmetical, and so for some formula φ, we have x ∈ K iff φ(x̄) ∈ T. But
this yields K ≤_1 T, so T cannot be recursive. We will see presently that a
modification of this argument shows that T is not even arithmetical.

To return to definability, note that another consequence of Theorem 
10.1 is that the arithmetical relations on N are exactly those definable in the 
structure with universe N and with all recursive relations. This is the 
natural structure in which to measure definability modulo recursiveness. 
Any relation definable in this structure is of course definable by a formula 
in prenex form. We will use the number of alternations between universal 
and existential quantifiers in that prenex formula as a measure of 
definability. 

We can formulate these ideas in the following way, and for an arbitrary
space 𝒩. Let Π_0 be the class of recursive relations. Define Σ_{n+1} to be the
class of all projections of Π_n relations, where in the present context P is
called a projection of R if


P(x_1, ..., x_k, α_1, ..., α_m) ⟺ ∃y R(y, x_1, ..., x_k, α_1, ..., α_m).


Thus if R is a (k+1, m)-ary relation, then P is a (k, m)-ary relation. We
allow projection here only along numerical axes, not function axes. To
complete the definition, let Π_n be the class of complements of relations
in Σ_n.

An equivalent definition of Σ_n and Π_n can, at least for countable 𝒩, be
formulated in terms of the structure ℜ with universe N and with all
recursive relations. A relation is in Σ_{n+1} iff it is definable in ℜ by a prenex
formula having n alternations between universal and existential quantifiers,
and whose outermost quantifier is existential. Π_{n+1} can be given a
similar characterization, wherein the outermost quantifier is universal. (For
uncountable 𝒩, analogous characterizations are possible, if provision for
function variables is made. We must allow formulas expressing α(x) = y.)

Although we have deliberately shifted from the structure 𝔑 to ℜ, it was
proved in 1970 that the above paragraph remains correct with 𝔑 in place of
ℜ. Matijasevič’s theorem that solved (or rather unsolved) Hilbert’s tenth
problem shows that any r.e. set of numbers can be put into the form


{y: ∃x (p(x, y) = q(x, y))}


where x ranges over tuples of natural numbers and p and q are polynomials
over N. For details on this theorem, see Chapter C.2.

To return to the arithmetical hierarchy, it is convenient to define also the 
class 


Δ_n = Σ_n ∩ Π_n,


when n ≥ 1. For example, Theorem 6.2 implies that Δ_1 is the class of
recursive relations, since Σ_1 is the class of semi-recursive relations and Π_1
is the class of complements of semi-recursive relations. By using vacuous
quantifiers in defining formulas we can easily establish the following
inclusions.


Δ_n ⊆ Σ_n ⊆ Δ_{n+1} and Δ_n ⊆ Π_n ⊆ Δ_{n+1} for each n ≥ 1.


The class of arithmetical relations is the union of all these classes. All the 
inclusions shown above are proper. This follows from the hierarchy 
theorem below, due to KLEENE [1943].


10.2. THEOREM. For each k, m, and n:

(i) There is a (k+1, m)-ary relation in Σ_{n+1} that is universal for
(k, m)-ary relations in Σ_{n+1} (i.e., every (k, m)-ary relation in Σ_{n+1} is
obtainable by holding the first numerical variable fixed as a parameter).

(ii) There is a (k, m)-ary relation in Σ_{n+1} that is not in Π_{n+1}.


Part (ii) follows from part (i) by diagonalization, and part (i) follows from 
the Normal Form Theorem. 

It is easy to see that both Σ_n and Π_n are closed under many-one
reducibility. They are not closed under Turing reducibility, since every set 
of numbers is recursive in its complement. 

We can now extend our proof that the set T of sentences true in 𝔑 is not
recursive, to show that T is not arithmetical. In place of K, take any Σ_{n+1}
subset A of N that is not Π_{n+1}. Then as before A ≤_1 T, so T cannot be
Π_{n+1}. Since n is arbitrary, T cannot be arithmetical.

We can relate the arithmetical hierarchy to the jump operation on 
degrees of unsolvability, thus linking the definability concepts with the 
ideas of Section 8. The jump operation was there characterized by an 
existential quantifier, which corresponds to projection of relations. This 
line of thought leads eventually to the theorem below, which is proved by 
iteration of Theorem 8.3(ii). Let ∅^(n) be the result of applying the jump
operation n times to ∅.


10.4. THEOREM (POST [1948]). A subset of N is in Σ_{n+1} iff it is r.e. in ∅^(n)
(and this holds iff it is one-one reducible to ∅^(n+1)). It is in Δ_{n+1} iff it is
recursive in ∅^(n).


Recall that the analytical sets are those definable in 𝔑 by formulas of
second order. As in the arithmetical hierarchy, we can use the number of
alternations between universal and existential quantifiers (now quantifying
function variables) as a measurement of definability. Let


Σ^1_0 = Π^1_0 = the class of arithmetical relations.


Define Σ^1_{n+1} to be the class of all projections along function axes of
relations in Π^1_n. Thus a relation in Σ^1_{n+1} is of the form


{z: ∃α_1 ⋯ ∃α_k R(z, α_1, ..., α_k)}


where R is in Π^1_n. Define Π^1_n to be the class of complements of relations in
Σ^1_n and define Δ^1_n to be Σ^1_n ∩ Π^1_n.

The simplest facts about the arithmetical hierarchy can be extended to
the analytical hierarchy. We have the inclusions:


Σ^1_0 = Π^1_0 ⊆ Δ^1_1, and Δ^1_n ⊆ Σ^1_n ⊆ Δ^1_{n+1}, Δ^1_n ⊆ Π^1_n ⊆ Δ^1_{n+1} for each n ≥ 1.

All inclusions here are proper, by a hierarchy theorem directly analogous 
to Theorem 10.2. 

These classes in the analytical hierarchy enjoy stronger closure proper- 
ties than do their counterparts in the arithmetical hierarchy. This is 
illustrated by Theorem 10.5 below, which is a quantitative version of the 
transitivity of definability. The classes Σ^β_n and Σ^{1,β}_n are defined by replacing
recursiveness by the relativized notion of recursiveness in β.


10.5. TRANSITIVITY THEOREM. Assume that A ⊆ N and β ∈ N^N.
(i) A ∈ Σ^β_{n+1} & β ∈ Δ_{m+1} ⇒ A ∈ Σ_{m+n+1}.
(ii) A ∈ Σ^{1,β}_n & β ∈ Δ^1_n ⇒ A ∈ Σ^1_n.


The proof of (i) is straightforward. The proof of (ii), due to SHOENFIELD
[1962], begins by writing


x ∈ A ⟺ ∃γ [γ = β & Q(x, γ)]


where Q is in Σ^1_n. The expression on the right is then put into prenex form.
Part (ii) prevents having an analogue of Post’s theorem (Theorem 10.4) 
hold for the analytical hierarchy. 

What is in Δ^1_1? A clue is provided by descriptive set theory. When we
relativize recursiveness to N^N, then Σ^1_1 becomes the class of projections of
Borel sets of finite rank. These are the analytic or A-sets of descriptive set
theory. Souslin’s theorem (SOUSLIN [1917]) shows that a set and its complement
are both analytic iff the set is a Borel set. This suggests, by analogy,
that Δ^1_1 consists of the sets obtained by extending the arithmetical
hierarchy into the transfinite. This is exactly correct; the hierarchy is called
the hyperarithmetical hierarchy. It can be constructed by suitably iterating
the jump operation over all ordinals that are “recursively countable” in the
sense of being the order type of some recursive well ordering of numbers. 

Determining the location of a given set in these hierarchies reduces to
two problems. First there is the (usually easy) matter of establishing the
positive fact that the set in question does belong to, say, Σ_3. Then there is
the negative task of showing that this is the best possible result, by showing
that the set does not belong to the dual class Π_3. Take for example the
subset R of N^N consisting of the total recursive functions from N into N.
Then R ∈ Σ_3, because


α ∈ R ⟺ ∃e ∀x ∃y [T(e, ⟨x⟩, y) & U(y) = α(x)]


and the part in brackets is recursive. SHOENFIELD [1958] has shown that R is
not in Π_3.

We have seen that the set of true sentences of arithmetic is not 
arithmetical. This set does lie in the hyperarithmetical hierarchy, at level w. 
This is a sharpening of Gödel’s incompleteness theorem, which asserts that
the set is not r.e. But now consider the characteristic function τ of the set of
true sentences. Then the singleton {τ}, as a subset of N^N, is in Π_2. To prove
this we look closely at the definition of satisfaction in 𝔑 (Definition 3.8 of
Chapter A.1). Viewed as conditions on τ, the definition is a Π_2 statement
that is true of τ and nothing else. Since {τ} is in Π_2 it follows easily that
truth is in Δ^1_1:


s is true ⟺ ∀α [α ∈ {τ} → α(s) = 1]
⟺ ∃α [α ∈ {τ} & α(s) = 1].


The class of well orderings of N can be identified with a subset of N^N,
namely
{α ∈ N^N: {(x, y): α(⟨x, y⟩) = 0} well orders N}.


It is easy to see that this set is in Π^1_1; we just write out its definition
carefully and then count quantifiers. The set is not in Σ^1_1; in fact by a
classical result it is not even analytic.


11. Recursive analogues of classical objects 


The recursive functions are those you can actually write computer 
programs for; the others are more slippery and elusive. If one wants to 
approach mathematics from a constructive viewpoint, the recursive func- 
tions have a firmer ontological status than the others. One can, in principle, 
approach a mathematical object (the set of countable ordinals and the set 
of real numbers will be featured examples), and extract that part of it that is 
sufficiently constructive to be treated by recursion theory. This constructive 
part can then be viewed as a recursive analogue of the original.

As a preliminary example, let us consider the extent to which the
collection R of total recursive functions on N is an analogue for the
collection N^N of all functions on N. There is the obvious disparity of size; N^N
is uncountable whereas there are only countably many recursive functions.
But the complete analogue of the uncountability of N^N would say that there
is no recursive one-to-one correspondence between the natural numbers
and indices of total recursive functions. One formulation of this phenomenon
is the following simple result.


11.1. THEOREM. There is no total recursive function f on N such that
{{f(n)}: n ∈ N} coincides with the collection R of total recursive functions.


PROOF. As in the proof of the uncountability of N^N, we diagonalize. The
equation g(x) = {f(x)}(x) + 1 defines a total recursive function that f has
omitted. There is another proof that gives more quantitative information.
From the fact (see Section 10) that R is not Π_3, we conclude that there
cannot be any Σ_2 set A such that R = {{x}: x ∈ A}. □
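
The diagonal step can be seen in miniature as follows. This is only a sketch: an effective listing of total functions is modelled as a Python function from n to the n-th listed function, and the particular listing chosen is an arbitrary illustration.

```python
from typing import Callable

Total = Callable[[int], int]


def diagonal(listing: Callable[[int], Total]) -> Total:
    """g(x) = listing(x)(x) + 1, which differs from the x-th listed
    function at the argument x."""
    return lambda x: listing(x)(x) + 1


# Illustrative listing: the n-th function is x -> n * x + n.
listing = lambda n: (lambda x: n * x + n)
g = diagonal(listing)
disagreements = [g(n) != listing(n)(n) for n in range(10)]
```

Since g is computed from the listing by one evaluation and one addition, it is total and computable whenever the listing is, yet it cannot appear anywhere in the listing.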


The operations of addition, multiplication, and composition of functions 
are applicable to R as well as to N^N. But from the recursive viewpoint we
need additional information: it must be possible, given indices of f and g,
to find effectively indices for f + g, f·g, and f∘g. To prove that this can
indeed be done, observe for example that {x}(z)+{y}(z) is a recursive 
partial function of x, y, and z, and apply the parameter theorem. 

We can summarize this by saying that the operations of addition, 
multiplication, and composition of functions are “effective on the indices”.
That is, each of our functions has indices that denote it, and operations on 
the functions are induced by operations on the indices, which in the present 
case are recursive operations. This phenomenon will recur in the subse- 
quent examples: the constructive members of the classical object come 
with indices that denote them, and effective operations must work with 
these names. 

For a more serious example of a recursive analogue of a classical object, 
take (as a classical object) the set of countable ordinals. Knowing that 
ordinal numbers are useful in transfinite constructions (as in the Borel 
hierarchy), we should not be surprised to find that a recursive analogue 
would be useful in transfinite constructions in recursion theory (as in the 
hyperarithmetical hierarchy, mentioned in the preceding section). 

A cheap analogue is provided by the set 


W = {x: {x}^{2,0} encodes a well ordering},

where f is said to encode a well ordering if f is a total function from N × N
into {0, 1} and {(x, y): f(x, y) = 0} is a well ordering of a subset of N. More
briefly, we can refer to W as the set of indices of recursive well orderings.
For x in W, let ||x|| be the ordinal number of the corresponding well
ordering. Since an initial segment of a recursive well ordering is another
such ordering, the set {||x||: x ∈ W} is an initial segment of the countable
ordinals. Its least upper bound will be called “recursive ω_1”. Although
recursive ω_1 is countable, it is the first ordinal that is not recursively
countable.

Unfortunately, W has some deficiencies that impair its usefulness. For 
example, given x in W, we cannot tell effectively whether ||x|| is a successor
ordinal. Even if it is a successor ordinal, we cannot effectively find a 
member of W denoting the predecessor. 

A better quality analogue is provided by a system of ordinal notations 
introduced by KLEENE [1938]. The set O is built up from below, unlike the 
set W. We will give an inductive definition simultaneously for the set O, the 
binary relation <_O, and a map x ↦ |x| from O into the ordinals. (The
definition differs only slightly from Kleene’s version.)

(i) 1 ∈ O and |1| = 0. Thus 1 is the unique number denoting the ordinal 0.

(ii) If x ∈ O then 2^x ∈ O, x <_O 2^x, and |2^x| = |x| + 1. This is the
successor step.

(iii) If x <_O y <_O z then x <_O z. The relation <_O will in the end be a
partial well ordering on O.

(iv) If {e} is a total recursive increasing sequence of notations (i.e.,
{e}: N → O and {e}(n) <_O {e}(n+1) for each n), then 3·5^e ∈ O,
{e}(n) <_O 3·5^e for each n, and |3·5^e| = sup{|{e}(n)|: n ∈ N}.

This inductive definition assigns notations to an initial segment {|x|:
x ∈ O} of the countable ordinals. It turns out that the least ordinal not
receiving a notation is again recursive ω_1. In fact there are several other
ways to characterize recursive ω_1: it is the least ordinal that is not the
order type of any Σ^1_1 well ordering of numbers, and it is the least admissible
ordinal after ω (in the sense of Chapter C.5).
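
The bottom of the notation system, generated by clauses (i) and (ii) alone, is easy to compute with. The sketch below covers only those two clauses; the limit clause (iv) is left out because it depends on an indexing of partial recursive functions and on totality, which cannot be checked effectively. The function names are introduced here for illustration.

```python
from typing import Optional


def notation(k: int) -> int:
    """The notation for the finite ordinal k: start at 1, apply x -> 2^x k times."""
    x = 1
    for _ in range(k):
        x = 2 ** x
    return x


def finite_ordinal(x: int) -> Optional[int]:
    """Return |x| if x is reachable from 1 by successor steps only, else None."""
    height = 0
    while x != 1:
        if x < 1 or x & (x - 1) != 0:   # x is not a power of two
            return None
        x = x.bit_length() - 1          # replace 2^y by y
        height += 1
    return height


first_notations = [notation(k) for k in range(5)]  # [1, 2, 4, 16, 65536]
```

Unlike the situation for W, here the predecessor of a successor notation 2^x is found effectively, simply by taking the exponent.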

As a final example of recursive analogues, we will consider recursive real 
numbers. This subject was first treated in TURING [1936]; a more recent
reference is RICE [1954]. Any real number can be approximated by
rationals, but we want to single out those numbers for which we can 
effectively generate rational approximations, and with a known estimate of 
the error. These are the realest of the reals. 

Just as there are several ways to construct the reals from the rationals, so
are there several equivalent ways of making precise the concept of a
recursive real. For example, we could require that the binary expansion of
the fractional part of the number be given by a recursive function. Equivalently,
we can adapt the Cauchy sequence construction. To each natural
number n assign the rational number r(n) = ((n)_0 − (n)_1)/((n)_2 + 1). Then
say that e denotes a recursive real if {e} is total and whenever m > n then


|r({e}(n)) − r({e}(m))| < 1/2^n.


The essential feature of both definitions is that we can effectively produce 
both upper and lower rational bounds converging to the number. 

The recursive reals form a field, since for x#0 we can get rational 
bounds of the same sign which then convert to rational bounds on 1/x. (But 
given that e denotes a recursive real, there is no effective way to decide 
whether that real is zero.) Not only do we get a field, but the algebraic 
operations are “effective on the indices”, e.g., there is a recursive function f
such that if a and b denote recursive reals then f(a, b) denotes their sum. 
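
A sketch of addition "on the names" follows. It makes two simplifying assumptions: a name of a real is modelled directly as a Python function from n to a rational, rather than as an index e decoded through r, and the convergence condition is taken to be |a(n) − a(m)| < 2^(−n) for m > n. The names sqrt2 and add are illustrative.

```python
from fractions import Fraction
from typing import Callable

Name = Callable[[int], Fraction]  # n -> rational approximation


def sqrt2(n: int) -> Fraction:
    """Lower endpoint after n + 1 bisections of [1, 2] around sqrt(2);
    the enclosing interval then has width 2^-(n+1)."""
    lo, hi = Fraction(1), Fraction(2)
    for _ in range(n + 1):
        mid = (lo + hi) / 2
        if mid * mid <= 2:
            lo = mid
        else:
            hi = mid
    return lo


def add(a: Name, b: Name) -> Name:
    """Name of the sum: query both summands one step further out, so the
    two errors of 2^-(n+1) together stay below 2^-n."""
    return lambda n: a(n + 1) + b(n + 1)


s = add(sqrt2, sqrt2)   # a name for 2 * sqrt(2)
```

The operation add never inspects which real its arguments denote; it only transforms the approximating procedures, which is what being effective on the indices amounts to.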

The algebraic real numbers are all recursive. In fact it is not too hard to 
see that the recursive reals form a real closed ordered field, i.e., an ordered 
field in which any polynomial that changes sign has an intervening root. 
Consequently the result of adjoining √−1 to the field is algebraically
closed. (See VAN DER WAERDEN [1953], section 70.) Furthermore by Tarski’s 
theorem, the field of recursive reals is elementarily equivalent to the field 
of all real numbers (see Chapter A.2). 

In addition to the algebraic numbers, various transcendental numbers 
such as e and π are recursive, since there are well-known methods of
churning out convergent upper and lower rational bounds. 

The recursive reals are not as kind to analysts as they are to algebraists. 
The following example occurs in RICE [1954]; see also SPECKER [1949]. Start
with a non-recursive r.e. set K. There is a one-to-one total recursive
function f whose range is K (think of f(n) as the (n + 1)-st distinct number
to emerge in an effective enumeration of K). Consider the set of rational
numbers:


{ Σ_{k ≤ n} 2^{−f(k)} : n ∈ N }.


This is a bounded recursive set of rational numbers. But its least upper
bound (which is the only limit point of the set) is not a recursive real 
number, lest K be recursive. 

Finally, let us consider the analogues of functions of a real variable. A 
function F from the set of recursive reals into itself is called a recursive
operator if there is a recursive partial function f such that whenever e
denotes a recursive real x then f(e) is defined and denotes F(x). Thus f, 
working on the indices, performs F. We state without proof the following 
result. 


11.2. THEOREM (KREISEL, LACOMBE and SHOENFIELD [1959], CEITIN
[1959]). Every recursive operator is continuous. Moreover, given an e
denoting a recursive real and a positive rational ε, we can effectively find a δ
that works.


Introduction


Many important mathematical problems take the form: find an al- 
gorithm (i.e. an effective procedure) by means of which it can be deter- 
mined (in a finite number of steps) for each element of a given set, whether 
or not the element possesses some given property. The “solution” of such a
“decision” problem is then to consist of actually exhibiting an algorithm
and providing a proof that the algorithm does what is required of it. 

A standard example from the elementary theory of numbers is to give an 
algorithm by means of which it can be determined for a given ordered 
triple (a, b,c) of integers, whether or not the equation 


ax+by=c 


has a solution in integers. In this case, a solution is: find the largest natural 
number d which simultaneously divides a and b and then test whether or 
not d is a divisor of c. (The equation will have an integer solution just in 
case d does divide c.) Now, the development of recursion theory with its 
precise explication of the intuitive notion of effective computability has 
made a new alternative available: instead of solving a decision problem 
positively by showing the existence of an algorithm, we can solve it 
negatively by proving that no algorithm meeting the requirements exists. 
Or, as we say, we can prove that the problem is unsolvable. Thus, Hilbert’s 
tenth problem, which sought an algorithm for testing an arbitrary polynomial
equation P(x_1, ..., x_n) = 0 with integer coefficients for possessing a
solution in integers, was shown to be unsolvable by MATIJASEVIČ [1970].
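
The positive solution described above for the equation ax + by = c can be written out in a few lines. The following is only a sketch: the function name is ours, and the case a = b = 0 (where no largest common divisor exists) is handled separately as an added assumption.

```python
from math import gcd


def has_integer_solution(a: int, b: int, c: int) -> bool:
    """Decide whether a*x + b*y = c has a solution in integers x, y."""
    d = gcd(a, b)      # largest natural number dividing both a and b
    if d == 0:         # a = b = 0: the equation reads 0 = c
        return c == 0
    return c % d == 0  # solvable just in case d divides c
```

For instance, 6x + 10y = 8 is accepted (d = 2 divides 8), while 6x + 10y = 7 is rejected.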

The only prerequisite for this chapter is familiarity with Sections 1-4 of 
Chapter C.1, Elements of recursion theory, which we cite as ERT and whose 
notation we follow. We shall survey the main unsolvability results which 
have been obtained for problems of genuine mathematical interest. In 
some cases we shall give complete proofs, in others we shall content 
ourselves with an outline of the main ideas. 

The mathematical content of an unsolvability result is that some 
particular set is not recursive. And in fact, if S is any non-recursive set of 
natural numbers, we can assert the unsolvability of the decision problem: to 
determine of a given x ∈ N whether or not x ∈ S. Of course this conclusion is
based on the identification of the intuitive notion “S is decidable” with the
precise notion “S is recursive”, i.e. what is usually called Church’s thesis.
(Cf. Section 3 of ERT.) In this chapter we freely use Church’s thesis often 
without specific comment. As just indicated this is necessary if we are to 
conclude from the non-recursiveness of a set, that there really is no
corresponding algorithm. But in addition it will be used to simplify proofs:
whenever we exhibit an algorithm for computing the values of a partial 
function f we will conclude without further ado that f is a recursive partial 
function. 

In addition to proving that certain mathematical decision problems are 
unsolvable, there has been considerable interest in calculations of the 
degree of unsolvability of such problems in the sense discussed in ERT, 
Section 8. However, we shall not discuss these matters in this chapter. 


1. The halting problem revisited 


The unsolvability of the halting problem is given in 4.2 of ERT in the 
form: 


The set K = {x : {x}(x) < ∞} is not recursive.


By the Normal Form Theorem (ERT, 4.1), {x}(x) is a recursive partial 
function of x. Hence by the discussion of Section 2 of ERT, there is some 
Turing machine M (which could in fact be concretely exhibited at the cost 
of some unpleasant labor) such that: 

If we start M at instruction q_0 scanning the leftmost symbol of ⌜x⌝ (with
all the remaining symbols on the tape blank), then M eventually halts if
and only if {x}(x) is defined, i.e. if and only if x ∈ K. In the rest of this
chapter, M will always stand for this Turing machine.

Since K is not recursive we conclude: 


1.1. THEOREM. There is no algorithm for testing a given x ∈ N to determine
whether or not M will eventually halt when started at instruction q_0 scanning
the leftmost symbol of ⌜x⌝.


By weakening this result it assumes a more elegant form (recall from 
ERT, Section 4 that a configuration for a Turing machine consists of the 
current contents of the machine tape, the instruction number, and the 
square being scanned): 


1.2. COROLLARY. There is no algorithm for testing a given configuration of
M to determine whether or not M will eventually halt when started at the 
given configuration. 


Of course the machine M is formally a function (as explained in
Definition 2.1 of ERT) which can be concretely exhibited as a table with
five columns and an appropriate number of rows, or as we shall say, 
quintuples. (Cf. the example in Section 2 of ERT of a Turing machine which 
computes x+y.) The halting problem for M is then a quite specific 
mathematical question: for which of a given set of configurations will a 
certain well-defined combinatorial process eventually terminate? 

The halting problem will play a fundamental role in this article. Indeed 
the unsolvability of all the problems which we will discuss follows from that 
of the halting problem. 


1.3. DEFINITION. A k-ary relation R on N is recursively enumerable
(abbreviated r.e.) if there is a Turing machine T such that whenever we start T
at instruction q_0 scanning the leftmost symbol of


⌜x_1⌝ 0 ⌜x_2⌝ 0 ··· 0 ⌜x_k⌝


(with the rest of the tape blank), then T eventually halts if and only if
(x_1, ..., x_k) ∈ R.

Recursive enumerability of a k-ary relation on N is defined in Section 6 
of ERT (which is beyond our formal prerequisites) in a slightly different, 
but equivalent, way. The equivalence proof requires overcoming a not very 
profound technical hurdle. Of course we will use the present definition 
throughout this chapter. 

Intuitively, to say that R is r.e. is to say that we possess an algorithm 
which will ultimately find any correct “yes” answers to questions of the
form “Is it the case that (x_1, ..., x_k) ∈ R?” but will fail to terminate when
the answer is “no”.

Writing R̄ for the complement of the k-ary relation R (i.e. (x_1, ..., x_k) ∈
R̄ if and only if (x_1, ..., x_k) ∉ R), we have:


1.4. THEOREM. R is recursive if and only if R and R̄ are r.e.


PROOF. Let R be recursive and let its characteristic function (i.e. the
function which is 1 when R is true and 0 when it is false) be computed by
Turing machine M_0. Let q_0, q_1, ..., q_n be the instructions of M_0. We modify
M_0 to obtain new machines M^+ and M^- by adjoining additional states and
quintuples. For each pair (i, α), 0 ≤ i ≤ n, α = 0 or 1, for which M_0’s table
contains no quintuple beginning i α (i.e. M_0 will halt at once if it comes to
instruction q_i scanning a square on which α is written), we place in the
tables of both M^+ and M^- the quintuples:


i α α R (n+1).

Now, when M_0 halts, the scanned square contains 1, and its right-hand
neighbor contains 1 if R is true for the given inputs, and 0 if R is false for
the given inputs. Thus, on arriving at instruction n+1, M^+ and M^- find
themselves scanning 1 or 0 according as R is true or false for the given
inputs. Finally, we add to M^+ the quintuple


(n+1) 0 0 R (n+1)

and to M^- the quintuples

(n+1) 1 1 R (n+2)

(n+2) 0 0 R (n+2).


Then M^+ halts for given ⌜x_1⌝ 0 ··· 0 ⌜x_k⌝ if and only if (x_1, ..., x_k) ∈ R and
M^- halts if and only if (x_1, ..., x_k) ∈ R̄.

Conversely, if R and R̄ are r.e. we are given Turing machines M^+, M^-
which halt on given ⌜x_1⌝ 0 ··· 0 ⌜x_k⌝ in case (x_1, ..., x_k) ∈ R or ∈ R̄
respectively. Thus, to test whether or not (x_1, ..., x_k) ∈ R for given
x_1, ..., x_k we need only run M^+, M^- simultaneously until one of them halts.
So, R is intuitively decidable, and, using Church’s thesis, it is recursive.
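
The simultaneous run used in the converse direction can be pictured as follows. This is a sketch under stated assumptions, not a Turing machine construction: each semi-decision procedure is modeled as a Python generator that yields once per step and returns exactly when the corresponding machine would halt, and all names (`decide`, `halts_iff_even`, `halts_iff_odd`) are hypothetical.

```python
from typing import Callable, Iterator, Tuple

# A semi-decider yields once per computation step and returns when it halts.
SemiDecider = Callable[[Tuple[int, ...]], Iterator[None]]


def decide(m_plus: SemiDecider, m_minus: SemiDecider, xs: Tuple[int, ...]) -> bool:
    """Alternate single steps of m_plus and m_minus until one of them halts."""
    run_plus, run_minus = m_plus(xs), m_minus(xs)
    while True:
        try:
            next(run_plus)       # one step of the machine that halts on R
        except StopIteration:
            return True          # m_plus halted: xs belongs to R
        try:
            next(run_minus)      # one step of the machine for the complement
        except StopIteration:
            return False         # m_minus halted: xs does not belong to R


def halts_iff_even(xs: Tuple[int, ...]) -> Iterator[None]:
    """Toy semi-decider: halts exactly when its single argument is even."""
    (x,) = xs
    while x % 2 != 0:
        yield


def halts_iff_odd(xs: Tuple[int, ...]) -> Iterator[None]:
    """Toy semi-decider for the complement: halts exactly on odd arguments."""
    (x,) = xs
    while x % 2 == 0:
        yield
```

Because exactly one of the two procedures halts on every input, `decide` always terminates, although neither procedure alone does.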

Since the machine M used in our discussion of the halting problem 
eventually halts for given x if and only if x ∈ K we have:


1.5. THEOREM. The set K is r.e. but not recursive. Hence K̄ is not r.e.


2. Semi-Thue processes 


Let A = {a_1, ..., a_n} be some given finite set of objects we call symbols.
A is then called an alphabet, and a finite sequence of elements of A is
called a string or a word on A. The basic operation on strings is
concatenation. That is, if u = a_{i_1} a_{i_2} ··· a_{i_m}, v = a_{j_1} a_{j_2} ··· a_{j_l}
are strings on A, we write

uv = a_{i_1} a_{i_2} ··· a_{i_m} a_{j_1} a_{j_2} ··· a_{j_l}.


Note that u(vw) = (uv)w. It is convenient to permit the null string Λ of
length 0 where uΛ = Λu = u, for all strings u.


2.1. DEFINITION. A semi-Thue production (on A) (or just production) is an
ordered pair (g, ḡ) of words on A. It is customary to write the production in
the form: g → ḡ. A semi-Thue process (on A) is a finite non-empty set of
semi-Thue productions on A.


2.2. DEFINITION. Let P be the semi-Thue production g → ḡ. Then we write
u ⇒_P v (omitting the P when there is no ambiguity) if we have u = agb
and v = aḡb for (possibly null) strings a, b.


2.3. DEFINITION. Let Π be a semi-Thue process. Then we write u ⇒_Π v if
u ⇒_P v for some P ∈ Π. We write u ⇒*_Π v if there is a finite sequence:
u = u_1 ⇒_Π u_2 ⇒_Π ··· ⇒_Π u_n = v, for n ≥ 1. (In particular u ⇒*_Π u.) Again,
we omit the Π when no ambiguity will result. (The symbols P or Π are
attached to the ⇒ as a subscript and the * as a superscript.)

The word problem for a given semi-Thue process Π is the decision
problem: to find an algorithm by means of which it can be determined for
given words u, v whether or not u ⇒* v.
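
To make the definitions concrete, here is a small sketch of one-step rewriting and of a bounded search for a derivation. It assumes single-character symbols (so that words are Python strings) and a step bound `max_steps` that is our own addition; the search can confirm a derivation but, when the bound cuts it off, gives no answer. That is exactly the behaviour one expects of a problem that may be unsolvable.

```python
from collections import deque
from typing import Iterable, Optional, Set, Tuple

Production = Tuple[str, str]  # (g, g_bar), read as g -> g_bar


def successors(u: str, productions: Iterable[Production]) -> Set[str]:
    """All words v with u => v, i.e. u = a g b and v = a g_bar b."""
    out = set()
    for g, g_bar in productions:
        start = u.find(g)
        while start != -1:
            out.add(u[:start] + g_bar + u[start + len(g):])
            start = u.find(g, start + 1)
    return out


def derivable(u: str, v: str, productions: Iterable[Production],
              max_steps: int) -> Optional[bool]:
    """True if u =>* v in at most max_steps steps, False if every word
    reachable from u was examined without finding v, None if the bound
    cut the search off (no conclusion)."""
    productions = list(productions)
    seen = {u}
    frontier = deque([(u, 0)])
    cut_off = False
    while frontier:
        w, depth = frontier.popleft()
        if w == v:
            return True
        if depth == max_steps:
            cut_off = True
            continue
        for nxt in successors(w, productions):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return None if cut_off else False
```

With the single production ab → ba, the word aab derives baa in two steps, while baa derives nothing but itself.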

We shall see that there are semi-Thue processes whose word problem is
unsolvable. In fact we shall show how to obtain from any given Turing
machine T, a corresponding semi-Thue process Π(T) such that an algorithm
for solving the word problem for Π(T) can also be used to solve
the halting problem for T. Together with the results of Section 1, this will
yield the desired result.

Thus let T be a Turing machine with instructions q_0, q_1, ..., q_n. Then,
Π(T) will be a semi-Thue process on the alphabet:


A = {0, 1, q_0, q_1, ..., q_n, q, q', h}.


The successive configurations of T through the course of a computation
will each be represented by a so-called Post word, that is a word on A of the
form:

h u q_i v h (1)


where u, v are words on {0, 1}, v ≠ Λ, and 0 ≤ i ≤ n. Here the Post word (1)
is to represent a configuration in which q_i is the current instruction, the
tape contents is uv (augmented by infinite sequences of 0’s on its left and
right) and the initial symbol of v is being scanned. The h’s serve as
punctuation marks; their importance will soon be plain. Since it is not
precluded that u begin with 0 or that v end with 0, there are infinitely many
different Post words corresponding to a given configuration of T. The effect
of T on successive configurations will be mimicked (or as is sometimes said,
simulated) by productions of Π(T) which will have corresponding effects
on the Post words.

In fact, the productions of Π(T) are as follows:

(i) For each quintuple in the table of T of the form


i α β R j        α, β ∈ {0, 1},

Π(T) is to contain the three productions

q_i α 0 → β q_j 0
q_i α 1 → β q_j 1
q_i α h → β q_j 0 h.

(ii) For each quintuple in the table of T of the form

i α β L j        α, β ∈ {0, 1},

Π(T) is to contain the three productions

0 q_i α → q_j 0 β
1 q_i α → q_j 1 β
h q_i α → h q_j 0 β.

(iii) For each pair (i, α), 0 ≤ i ≤ n, α ∈ {0, 1} for which no quintuple in
the table of T begins i α, Π(T) is to contain the production

q_i α → q α.

(iv) Finally Π(T) contains the five productions:

q 0 → q
q 1 → q
q h → q' h
0 q' → q'
1 q' → q'.
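
The four groups of productions can be generated mechanically from a quintuple table. The sketch below does this and then rewrites a Post word step by step. Several things here are our own assumptions for illustration: words are tuples of tokens (`"q0"`, `"q"`, `"q'"`, `"h"`, `"0"`, `"1"`), a quintuple i α β R j is the tuple `(i, α, β, "R", j)`, the toy machine is invented, and the tape contents 11 merely stand in for an input word rather than following the numeral coding of ERT.

```python
from typing import List, Optional, Tuple

Word = Tuple[str, ...]
Quintuple = Tuple[int, str, str, str, int]  # (i, alpha, beta, "R" or "L", j)
Production = Tuple[Word, Word]


def post_productions(table: List[Quintuple], n: int) -> List[Production]:
    """Productions (i)-(iv) for a machine with instructions q_0, ..., q_n."""
    prods: List[Production] = []
    defined = set()
    for i, a, b, move, j in table:
        defined.add((i, a))
        qi, qj = f"q{i}", f"q{j}"
        if move == "R":   # group (i)
            prods += [((qi, a, "0"), (b, qj, "0")),
                      ((qi, a, "1"), (b, qj, "1")),
                      ((qi, a, "h"), (b, qj, "0", "h"))]
        else:             # group (ii)
            prods += [(("0", qi, a), (qj, "0", b)),
                      (("1", qi, a), (qj, "1", b)),
                      (("h", qi, a), ("h", qj, "0", b))]
    for i in range(n + 1):  # group (iii): pairs at which the machine halts
        for a in "01":
            if (i, a) not in defined:
                prods.append(((f"q{i}", a), ("q", a)))
    prods += [(("q", "0"), ("q",)), (("q", "1"), ("q",)),  # group (iv)
              (("q", "h"), ("q'", "h")),
              (("0", "q'"), ("q'",)), (("1", "q'"), ("q'",))]
    return prods


def step(word: Word, prods: List[Production]) -> Optional[Word]:
    """Apply the first applicable production, or return None if none applies."""
    for lhs, rhs in prods:
        k = len(lhs)
        for s in range(len(word) - k + 1):
            if word[s:s + k] == lhs:
                return word[:s] + rhs + word[s + k:]
    return None


def reaches_halt_word(word: Word, prods: List[Production], max_steps: int) -> bool:
    """True if h q' h is reached within max_steps rewriting steps."""
    halt = ("h", "q'", "h")
    for _ in range(max_steps):
        if word == halt:
            return True
        nxt = step(word, prods)
        if nxt is None:
            return False
        word = nxt
    return word == halt
```

For the one-quintuple machine `[(0, "1", "1", "R", 0)]`, which runs right over 1's and halts on the first 0, the Post word h q_0 1 1 h is rewritten to h q' h in seven steps.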

Now suppose that h u q_i α γ v h, where α, γ ∈ {0, 1}, is a Post word which
corresponds to a given configuration of T and that the table of T contains
the quintuple

i α β R j.

Then we have

h u q_i α γ v h ⇒_{Π(T)} h u β q_j γ v h

and in fact h u β q_j γ v h is a Post word corresponding to the next
configuration of T. (I.e. α has been replaced by β on the tape, the scanned
square has been moved one to the right, and instruction q_j is the new current
instruction.) In the case where the Post word has the form h u q_i α h, we
have:

h u q_i α h ⇒_{Π(T)} h u β q_j 0 h,

i.e. the necessary blank has been appended on the right by the unique
production which can be applied. Thus the total effect of the productions
listed under (i) is to cause changes in a Post word exactly corresponding to
the effect on a machine configuration of the quintuple: i α β R j. It is easy
to check that the productions listed under (ii) behave analogously for
quintuples of the form: i α β L j.

The productions listed under (iii) will be able to operate on a Post word
just in case it corresponds to a configuration of T in which T has just been
forced to halt. The effect is to replace the instruction symbol q_i by q.
Finally, if u, v are words on {0, 1}, we have with respect to the productions
of (iv):


h u q v h ⇒* h u q h ⇒ h u q' h ⇒* h q' h. (2)


Now let T begin at instruction q_0 scanning the leftmost symbol of ⌜x⌝
(with the rest of the tape blank). The corresponding Post word is h q_0 ⌜x⌝ h.
First suppose that T will eventually halt. Then by the above discussion (in
particular noting (2)), we have:


h q_0 ⌜x⌝ h ⇒*_{Π(T)} h u q v h ⇒*_{Π(T)} h q' h.

Conversely, if


h q_0 ⌜x⌝ h ⇒*_{Π(T)} h q' h,


then in the sequence


h q_0 ⌜x⌝ h = u_1 ⇒ u_2 ⇒ ··· ⇒ u_n = h q' h,


each u_i must contain exactly one q-symbol (the q-symbols are of course
q_0, q_1, ..., q_n, q, q'), since each production preserves this property. The
productions of (i), (ii) each replace some q_i by a q_j, those of (iv) can never
replace a q_i. Hence a production from (iii) must have been used at least
(and therefore exactly) once. But this implies that T eventually halts. We
have proved:


2.4. THEOREM. The Turing machine T beginning at instruction q_0 scanning
the leftmost symbol of ⌜x⌝ (with the rest of the tape blank) will eventually halt
if and only if:

h q_0 ⌜x⌝ h ⇒*_{Π(T)} h q' h.


Setting T = M and using Theorem 1.1, we have at once: 


2.5. COROLLARY. There is no algorithm for testing a given x ∈ N to determine
whether or not


h q_0 ⌜x⌝ h ⇒*_{Π(M)} h q' h.


In particular:

2.6. COROLLARY. The word problem for Π(M) is unsolvable.


We conclude this section with a technical observation about Π(T) which
we will find useful in the next section:


2.7. REMARK. If u is a word on the alphabet A containing exactly one
q-symbol, then there is at most one production P ∈ Π(T) such that u ⇒_P v
for some word v.


The remark simply reflects the fact that no two quintuples beginning 
with the same pair ia can be part of the table of a Turing machine. 


3. Semigroups and groups 


We have already remarked that the operation of concatenation of strings 
on the alphabet A = {a_1, ..., a_n} is associative. In algebraic language, this
fact is expressed by saying that the set of strings forms a semigroup under 
concatenation. 


3.1. DEFINITION. The inverse of the semi-Thue production g → ḡ is the
production ḡ → g. A semi-Thue process Π is called a Thue process if for
each P ∈ Π, the inverse of P is also in Π.

For any semi-Thue process Π, the relation u ⇒*_Π v is obviously reflexive
and transitive. If Π is a Thue process, then whenever


u = u_1 ⇒_Π u_2 ⇒_Π ··· ⇒_Π u_n = v, (3)


clearly


v = u_n ⇒_Π u_{n-1} ⇒_Π ··· ⇒_Π u_1 = u,

so that the relation u ⇒*_Π v is also symmetric, and hence is an equivalence
relation, which we write u ~_Π v. We write [u] for the equivalence class
containing u. It is also clear that whenever (3) holds, we have for all words
w on A

wu = wu_1 ⇒_Π wu_2 ⇒_Π ··· ⇒_Π wu_n = wv

and

uw = u_1 w ⇒_Π u_2 w ⇒_Π ··· ⇒_Π u_n w = vw.


Hence, if u ~_Π v and u' ~_Π v' we have

uu' ~_Π vu' ~_Π vv'.

Thus, writing

[u] · [u'] = [uu'],

this product is well defined, and since the operation is associative, the
resulting structure is a semigroup.

If S is a semigroup defined in this way, the Thue process Π is called a
presentation of S. Each pair (g, ḡ) such that g → ḡ (and hence also ḡ → g) is
a production of Π, is called a relation of the presentation, and is usually
written g ~ ḡ (or even g = ḡ). Finally the symbols belonging to A are
called the generators of the presentation. Of course a given abstract
semigroup may well have many different presentations (or indeed none at
all).

If Π is a semi-Thue process, we write Π̄ for the Thue process obtained
from Π by adjoining to it the inverse of each production of Π. We have


3.2. LEMMA (POST [1947]). For any Turing machine T and x ∈ N, we have:

h q_0 ⌜x⌝ h ⇒*_{Π(T)} h q' h


if and only if


h q_0 ⌜x⌝ h ~_{Π̄(T)} h q' h.


PROOF. Since Π(T) ⊆ Π̄(T) the lemma is obvious in one direction.
Conversely, let


h q_0 ⌜x⌝ h = u_1 ⇒_{Π̄(T)} u_2 ⇒_{Π̄(T)} ··· ⇒_{Π̄(T)} u_n = h q' h.


Suppose furthermore that


u_1 ⇒_{P_1} u_2 ⇒_{P_2} ··· ⇒_{P_{i-1}} u_i ⇒_{P_i} u_{i+1} ⇒_{P_{i+1}} ··· ⇒_{P_{n-1}} u_n


where P_{i+1}, ..., P_{n-1} ∈ Π(T) but P_i ∉ Π(T). Since no production of Π(T)
can apply to h q' h, no inverse of such a production can yield h q' h as a
result; i.e. i < n − 1. It is clear that the words u_1, ..., u_n each contain
exactly one q-symbol, since the productions of Π̄(T) preserve this property.
Let Q be the inverse of P_i so that Q ∈ Π(T) and u_{i+1} ⇒_Q u_i. Since also
u_{i+1} ⇒_{P_{i+1}} u_{i+2} and P_{i+1} ∈ Π(T), Remark 2.7 implies that
u_i = u_{i+2}. Hence we have


u_1 ⇒_{P_1} u_2 ⇒_{P_2} ··· ⇒_{P_{i-1}} u_i ⇒_{P_{i+2}} u_{i+3} ⇒ ··· ⇒_{P_{n-1}} u_n


and the production P_i has been eliminated. Repeating the process, all
productions which do not belong to Π(T) can be eliminated, yielding the
desired conclusion. □


Referring to the corollaries 2.5 and 2.6, we have: 


3.3. COROLLARY (POST [1947]). There is no algorithm for testing a given
x ∈ N to determine whether or not


h q_0 ⌜x⌝ h ~_{Π̄(M)} h q' h.


3.4. COROLLARY (POST [1947], MARKOV [1949]). There is a presentation of a
semigroup with an unsolvable word problem.


The word problem for semigroups was first considered by THUE [1914]
long before the development of recursion theory raised the possibility of a 
negative solution. It was historically the first example of a decision problem 
whose original formulation was in no way related to logic and for which an 
unsolvability proof was given. 

We now turn to an important special case of semigroup presentations:
Let A contain an even number of symbols:


A = {a_1, ..., a_k, b_1, ..., b_k}

and let the relations of the Thue process Π contain all relations

a_i b_i ~ Λ, i = 1, 2, ..., k.


(Of course, in general, Π will contain other relations as well.) In this case Π
is called a group presentation and (as is readily verified) it is indeed a
presentation of a group. An abstract group G which has a group presentation
is said to be finitely presented. Because of its importance in algebraic
topology particular attention has been paid to the word problem for group
presentations (or as one says: the word problem for groups). The
Novikov-Boone theorem which asserts the existence of a group presentation
with an unsolvable word problem thus represented a major breakthrough.
Unfortunately, space does not permit our giving a proof of this
theorem here. We refer the interested reader to BOONE [1959] (which is
based on the construction of Π(M)), NOVIKOV [1955], BRITTON [1963],
ROTMAN [1973], Chapter 12, and to MCKENZIE and THOMPSON [1973]. The
last paper referred to is in BOONE, CANNONITO and LYNDON [1973] which
contains a wealth of interesting material and references in this area.


3.5. DEFINITION. We say that an abstract group G is recursively presented if
there are elements c_1, ..., c_n ∈ G such that:

(i) each element of G can be written in the form c_1^{m_1} ··· c_n^{m_n} where
m_1, ..., m_n ∈ Z (here Z is the set of integers positive, negative, and 0), i.e.,
c_1, ..., c_n generate the group G;

(ii) the relation R = {(m_1, ..., m_n) | c_1^{m_1} ··· c_n^{m_n} = e}, where e is the
identity element of G, is recursively enumerable.


Of course there is no difficulty in speaking of r.e. relations on Z rather
than N. E.g. we can set m̃ = 2m if m ≥ 0 and m̃ = −2m − 1 if m < 0 and
say that for R to be r.e. means that {(m̃_1, ..., m̃_n) | (m_1, ..., m_n) ∈ R}
is r.e.
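
The coding of Z into N just described (non-negative m goes to 2m, negative m to −2m − 1) is a bijection, which is what allows r.e. relations on Z to be handled as r.e. relations on N. A short sketch, with function names of our own choosing:

```python
def encode(m: int) -> int:
    """Code an integer as a natural number: 2m if m >= 0, else -2m - 1."""
    return 2 * m if m >= 0 else -2 * m - 1


def decode(code: int) -> int:
    """Inverse of encode: even codes are non-negative, odd codes negative."""
    return code // 2 if code % 2 == 0 else -(code + 1) // 2
```

The integers −3, ..., 3 receive the codes 5, 3, 1, 0, 2, 4, 6, so every natural number is used exactly once.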


The Novikov-Boone theorem is an easy consequence of: 


3.6. THEOREM (HIGMAN [1961]). A group is recursively presented if and only 
if it is isomorphic to a subgroup of a finitely presented group. 


This surprising theorem shows that a recursion-theoretic notion (that of 
being a recursively presented group) is equivalent to a purely algebraic 
notion. Simplified proofs of Higman’s theorem can be found in SHOENFIELD 
[1967], Appendix, AANDERAA [1973], and ROTMAN [1973], Chapter 12.

Let us say that a class 𝒢 of groups is:
(i) invariant, if G ∈ 𝒢 and H isomorphic to G implies that H ∈ 𝒢;
(ii) non-trivial, if there exist finitely presented groups G, H such that
G ∈ 𝒢 and H ∉ 𝒢;
(iii) hereditary, if G ∈ 𝒢 and H a subgroup of G implies that H ∈ 𝒢.
It has been shown that: 


3.7. THEOREM (ADJAN [1955], RABIN [1958]). Let 𝒢 be a class of groups
which is invariant, non-trivial, and hereditary. Then there is no algorithm by
means of which one can determine, given a group presentation, whether or not
the group belongs to 𝒢.


As an example, let 𝒢 consist only of the trivial group {e}. We obtain:


3.8. COROLLARY. There is no algorithm by means of which one can tell from
a group presentation, whether or not the group contains at least one element
other than e.


Other group properties to which the Adjan-Rabin theorem applies are
that of being:
(i) cyclic,
(ii) finite,
(iii) free,
(iv) commutative,
(v) solvable.

The proof of the Adjan-Rabin theorem makes use of the existence of a
group presentation with an unsolvable word problem, i.e. of the
Novikov-Boone theorem.

By means of a straightforward construction (HAKEN [1973]), it is possible 
to construct from a given group presentation of a group G a 2-dimensional 
simplicial complex having G as its fundamental group. Since a simplicial 
complex is simply connected just in case its fundamental group is trivial, 
i.e., consists of the identity element, we can conclude from Corollary 3.8: 


3.9. COROLLARY. There is no algorithm for testing a given 2-dimensional 
simplicial complex to determine whether or not it is simply connected. 


By using a much more elaborate and difficult construction, MARKOV
[1958] showed how given a group presentation Π it is possible to construct
for each n ≥ 4 a pair of simplicial complexes M_1(Π), M_2(Π) which are
n-dimensional manifolds such that M_1(Π) and M_2(Π) are homeomorphic if
and only if the group of which Π is a presentation is trivial. Again using
Corollary 3.8, we can conclude:


3.10. THEOREM. The homeomorphy problem for manifolds of dimension
≥ 4 is unsolvable.


The solution of the homeomorphy problem for two-dimensional manifolds
is a classical result going back to Riemann. For three-dimensional
manifolds, it remains open. For further discussion of topological decision
problems, cf. BOONE, HAKEN and POÉNARU [1968] and HAKEN [1973].


4. Other combinatorial problems 


One of the earliest examples of an unsolvable problem of a purely 
combinatorial nature was given in POST [1946], the so-called correspondence
problem. Later this problem turned out to have interesting applications to 
the formal theory of languages. 


4.1. DEFINITION. A Post correspondence system consists of an alphabet A
and a finite set of ordered pairs (h_i, k_i), i = 1, ..., m, of words on A. A
word u on A is called a solution of the system if for some sequence
1 ≤ i_1, i_2, ..., i_n ≤ m (the i_j’s need not be distinct), we have


u = h_{i_1} h_{i_2} ··· h_{i_n} = k_{i_1} k_{i_2} ··· k_{i_n}.
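
A solution of a Post correspondence system can always be confirmed by trying index sequences one after another. The sketch below does so up to a length bound `max_len` (our assumption; without a bound the search need not terminate). The function name is hypothetical and words are plain Python strings.

```python
from itertools import product
from typing import Optional, Sequence, Tuple


def find_solution(pairs: Sequence[Tuple[str, str]],
                  max_len: int) -> Optional[Tuple[int, ...]]:
    """Search index sequences i_1, ..., i_n (n <= max_len, indices 1-based as
    in the definition) with h_{i_1}...h_{i_n} = k_{i_1}...k_{i_n}."""
    for n in range(1, max_len + 1):
        for idx in product(range(len(pairs)), repeat=n):
            h_word = "".join(pairs[i][0] for i in idx)
            k_word = "".join(pairs[i][1] for i in idx)
            if h_word == k_word:
                return tuple(i + 1 for i in idx)
    return None  # nothing found within the bound; no conclusion follows
```

For the pairs (a, baa), (ab, aa), (bba, bb) the index sequence 3, 2, 3, 1 gives bbaabbbaa on both sides, so a search with bound 4 succeeds. The single pair (a, b) has no solution at all.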


Let Π be a given semi-Thue process, and let u, v be a pair of words in the
alphabet of Π. We shall show how to construct from Π together with u and
v a Post correspondence system which has a solution if and only if u ⇒*_Π v.
Using Corollary 2.6, we will then be able to conclude:


4.2. THEOREM. There is no algorithm for determining of a given Post
correspondence system whether or not it has a solution.


PROOF. The proof of 4.2 given here is that of Floyd (unpublished), and is
considerably simpler than Post’s original proof. It can also be found in
YASUHARA [1971].

We proceed with the required construction. Let Π be a semi-Thue
process on the alphabet A = {a_1, ..., a_n}, and let u, v be words on A. We
shall construct a Post correspondence system P on the alphabet

B = {a_1, ..., a_n, a_1', ..., a_n', [, ], *, *'},

consisting of 2n + 4 symbols. For any word w on A we write w' for the
word obtained from w by placing ' after each symbol of w.

Let the productions of Π be g_i → ḡ_i, i = 1, 2, ..., k and let us assume that
included among these productions are the n “identity” productions
a_i → a_i, i = 1, 2, ..., n. This last assumption does not restrict generality (i.e.
the class of pairs (r, s) for which r ⇒*_Π s is in no way changed by including
the identity productions). However, we may now state that u ⇒*_Π v if and
only if we can write

u = u_1 ⇒_Π u_2 ⇒_Π ··· ⇒_Π u_m = v

where m is odd. The Post correspondence system P on the alphabet B is
then to consist of the pairs:

([u*, [), (*, *'), (*', *),
(ḡ_j, g_j'), (ḡ_j', g_j), j = 1, 2, ..., k,
(], *'v]).
Now, let

u = u_1 ⇒_Π u_2 ⇒_Π ··· ⇒_Π u_m = v

where m is odd. Then the word

w = [u_1 * u_2' *' u_3 * ··· * u_{m-1}' *' u_m]

is a solution of P as is obvious from the two decompositions

w = [u_1* | u_2'*' | u_3* | ··· | u_m | ]
  = [ | u_1* | u_2'*' | ··· | u_{m-1}' | *'u_m].


To see for example that u_2'*' corresponds to u_1* we note that we can write
u_1 = r g_j s, u_2 = r ḡ_j s for some j = 1, 2, ..., k. Then u_2' = r' ḡ_j' s' and the
correspondence is obvious (recalling that the productions a_1 → a_1, ...,
a_n → a_n are present in Π).
Conversely, if w is a solution, then to avoid mismatches on the extreme
left and right (between primed and unprimed symbols), w must begin [ and
end ]. Hence, letting w̄ be the portion of w up to the first ],


w̄ = [u * ··· *' v],


where to begin with we have the correspondences

w̄ = [u* | ··· | ],
w̄ = [ | u* ··· | *'v].


Hence we must have u* corresponding to some r'*' and *'v to some
*s', where u ⇒*_Π r and s ⇒*_Π v. Continuing this procedure, we see
that u ⇒*_Π v. This completes the construction and hence the proof of
Theorem 4.2. □
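
The construction in the proof can be tried out on a small example. The sketch below builds the pairs of the correspondence system from a semi-Thue process, forms the word w from a derivation of odd length, and checks that w really is a solution. It rests on our own conventions: words are tuples of tokens, a primed symbol is the token with an apostrophe appended, and `is_solution` is a hypothetical helper that searches for one index sequence spelling w on both sides.

```python
from typing import List, Sequence, Tuple

Word = Tuple[str, ...]
Pair = Tuple[Word, Word]


def prime(w: Word) -> Word:
    """The word w' obtained by placing ' after each symbol of w."""
    return tuple(s + "'" for s in w)


def floyd_system(alphabet: Sequence[str], productions: Sequence[Pair],
                 u: Word, v: Word) -> List[Pair]:
    """Pairs of the correspondence system built from the process, u and v."""
    prods = list(productions) + [((a,), (a,)) for a in alphabet]  # identities
    pairs: List[Pair] = [(("[",) + u + ("*",), ("[",)),
                         (("*",), ("*'",)),
                         (("*'",), ("*",))]
    for g, g_bar in prods:
        pairs.append((g_bar, prime(g)))
        pairs.append((prime(g_bar), g))
    pairs.append((("]",), ("*'",) + v + ("]",)))
    return pairs


def solution_word(derivation: Sequence[Word]) -> Word:
    """The word [u_1 * u_2' *' ... *' u_m] for a derivation of odd length m."""
    w: Word = ("[",)
    for t, u_t in enumerate(derivation, start=1):
        odd = t % 2 == 1
        w += u_t if odd else prime(u_t)
        if t < len(derivation):
            w += ("*",) if odd else ("*'",)
    return w + ("]",)


def is_solution(pairs: Sequence[Pair], w: Word) -> bool:
    """Is there one index sequence whose first components and whose second
    components both spell w? State (p, q): w[:p] and w[:q] are spelled."""
    seen = {(0, 0)}
    stack = [(0, 0)]
    while stack:
        p, q = stack.pop()
        for h, k in pairs:
            p2, q2 = p + len(h), q + len(k)
            if w[p:p2] == h and w[q:q2] == k and (p2, q2) not in seen:
                if p2 == len(w) and q2 == len(w):
                    return True
                seen.add((p2, q2))
                stack.append((p2, q2))
    return False
```

With the alphabet {a, b}, the single production ab → ba, u = ab and v = ba, the derivation ab ⇒ ba ⇒ ba (the last step using an identity production to make the length odd) yields w = [ a b * b' a' *' b a ], and the check succeeds.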


In the formal theory of languages one deals with alphabets A = V ∪ T
where the elements of V (usually written as capital letters) are called
variables and those of T are called terminals. If one of the elements of V
(usually written S) is distinguished as the start symbol a semi-Thue process
Π on A is called a phrase structure grammar, and the set of all words v, on
the sub-alphabet T, such that S ⇒*_Π v is called the language generated by Π,
written L(Π).

Intuitively one can think of the variables as representing grammatical 
categories. For example, let V = {S, U, P, N, J, R, D} where we may think 
of these letters as representing the categories: sentence, subject, predicate, 
noun, adjective, verb, adverb, respectively. Let T consist of the (lower
case) letters of the English alphabet augmented by “#” (for “space”) and
“.”. Here is a primitive English grammar G:


S → UP.          N → man
U → the#N#       N → woman
U → the#J#N#     J → good
P → R            J → bad
P → R#D          R → runs
                 R → walks
                 D → quickly
                 D → slowly

The reader can easily verify that

S ⇒*_G the#bad#man#runs#quickly.

so that this last is in the language generated by G.


4.3. DEFINITION. A grammar is called context-free if all of its productions
are of the form A → g where A is a variable.
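
Since every production of the English grammar G has a single variable on its left, its language can be listed by always expanding the leftmost variable. The following sketch does this. It assumes the productions read S → UP., U → the#N#, U → the#J#N#, P → R, P → R#D together with the eight word productions (the period belonging to S → UP. only); this is our reading of the display, and the function name is hypothetical.

```python
from typing import Dict, List, Set

# Variables are the upper-case letters; everything else is a terminal.
GRAMMAR: Dict[str, List[str]] = {
    "S": ["UP."],
    "U": ["the#N#", "the#J#N#"],
    "P": ["R", "R#D"],
    "N": ["man", "woman"],
    "J": ["good", "bad"],
    "R": ["runs", "walks"],
    "D": ["quickly", "slowly"],
}


def language(grammar: Dict[str, List[str]], start: str = "S") -> Set[str]:
    """All terminal words derivable from start (finite for this grammar)."""
    done: Set[str] = set()
    pending = [start]
    while pending:
        w = pending.pop()
        # position of the leftmost variable, if any
        pos = next((i for i, ch in enumerate(w) if ch in grammar), None)
        if pos is None:
            done.add(w)
            continue
        for rhs in grammar[w[pos]]:
            pending.append(w[:pos] + rhs + w[pos + 1:])
    return done
```

Under these assumptions the language has 36 sentences (six subjects times six predicates), and the#bad#man#runs#quickly. is one of them.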


Thus our example G is context-free. We shall now show how to relate
Post's correspondence problem to context-free grammars. Thus, let (h_i, k_i),
i = 1, 2, ..., m be a Post correspondence system on an alphabet A. We
shall construct a pair G_1, G_2 of context-free grammars whose terminals
consist of A together with m new symbols c_1, ..., c_m.

In addition, the alphabet of G_1 has the single variable S_1 as start symbol
and that of G_2 the single variable S_2 as its start symbol. The productions of
G_1 are

S_1 → h_i S_1 c_i
S_1 → h_i c_i          i = 1, 2, ..., m,

and those of G_2 are

S_2 → k_i S_2 c_i
S_2 → k_i c_i          i = 1, 2, ..., m.

Thus,

L(G_1) = {h_{i_1} h_{i_2} ··· h_{i_n} c_{i_n} ··· c_{i_2} c_{i_1}}

and

L(G_2) = {k_{i_1} k_{i_2} ··· k_{i_n} c_{i_n} ··· c_{i_2} c_{i_1}}.


We have at once: 


4.4. LEMMA. The given Post correspondence system has a solution if and
only if L(G_1) ∩ L(G_2) ≠ ∅.
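
The lemma can be illustrated on a finite portion of the two languages. In the sketch below a word of L(G_1) or L(G_2) is represented as a tuple of tokens, the new terminals being the tokens `"c1"`, ..., `"cm"`; only index sequences up to a chosen length are generated (the bound and the function name are our assumptions).

```python
from itertools import product
from typing import Sequence, Set, Tuple


def words_up_to(components: Sequence[str], max_n: int) -> Set[Tuple[str, ...]]:
    """Words w_{i_1}...w_{i_n} c_{i_n}...c_{i_1} for all index sequences of
    length at most max_n, where w_i ranges over the given components."""
    out = set()
    for n in range(1, max_n + 1):
        for idx in product(range(len(components)), repeat=n):
            left = tuple("".join(components[i] for i in idx))
            right = tuple(f"c{i + 1}" for i in reversed(idx))
            out.add(left + right)
    return out
```

Because the trailing c-tokens record the index sequence, a word common to both sets forces the same indices on both sides. For the system (a, baa), (ab, aa), (bba, bb) the two sets built with bound 4 do intersect, reflecting the solution with indices 3, 2, 3, 1.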


We conclude from Theorem 4.2: 


4.5. THEOREM. There is no algorithm to test for a given pair of context-free
grammars Π_1, Π_2 whether or not L(Π_1) ∩ L(Π_2) = ∅.


4.6. DEFINITION. A phrase structure grammar Π with start symbol S is
called ambiguous if for some word u ∈ L(Π), we have

S = u_1 ⇒_Π u_2 ⇒_Π ··· ⇒_Π u_n = u

and

S = v_1 ⇒_Π v_2 ⇒_Π ··· ⇒_Π v_m = u

where the sequences (u_1, ..., u_n) and (v_1, ..., v_m) are not the same.


Now, let G_1, G_2 be as above. Let G have the same terminals as G_1 and
G_2, the variables S, S_1, S_2 (with S as start symbol), and the productions
S → S_1, S → S_2 together with the productions of G_1 and G_2. Then, clearly,
L(G) = L(G_1) ∪ L(G_2). Because G_1 and G_2 are themselves not ambiguous,
it is easy to see that G is ambiguous just in case there is a word u which
simultaneously belongs to L(G_1) and to L(G_2).

Using the proof of Theorem 4.5, we have: 


4.7. THEOREM. There is no algorithm by means of which we can test a given 
context-free grammar to determine whether or not it is ambiguous. 


For other examples of unsolvable problems in the formal theory of 
languages and further references, see HOPCROFT and ULLMAN [1969]. In
PATERSON [1970] it is shown, using the unsolvability of Post’s correspon- 
dence problem, that there is no algorithm which can be used to determine 
of a given finite set of 3 x 3 matrices with integer entries, whether or not 
some finite product of members of the set is equal to the zero matrix. 

Another combinatorial problem which has turned out to be unsolvable is
the problem of tag, first studied by E. L. Post in the 1920’s. Cf. POST [1943,
1965]. A tag process is given by an alphabet A = {a_1, ..., a_n}, n words
g_1, ..., g_n and a positive integer k. For such a tag process T, we write
u ⇒_T v if u begins with a_i and v is obtained from u by placing g_i at its
right-hand end and then crossing out the first k symbols of u. (Of course, it
is possible that v = Λ.) As before, we write u ⇒*_T v to mean that u =
u_1 ⇒_T u_2 ⇒_T ··· ⇒_T u_n = v for words u_1, ..., u_n and some n ≥ 1, and speak
of the word problem for T as the problem of determining for given u, v
whether or not u ⇒*_T v. MINSKY [1961, 1967] showed how to construct a tag
process with an unsolvable word problem.
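
A single step of a tag process is easy to program. The sketch below assumes single-character symbols and represents the words g_1, ..., g_n as a dictionary keyed by the leading symbol; the three-letter example process is ours, chosen only for illustration. When the word is shorter than k, the code simply crosses out the first k symbols of the extended word, which is one possible reading of the definition.

```python
from typing import Dict, List, Optional


def tag_step(u: str, rules: Dict[str, str], k: int) -> Optional[str]:
    """One step u => v: append the word for u's first symbol, then cross out
    the first k symbols. Returns None for the null word (no step possible)."""
    if not u:
        return None
    return (u + rules[u[0]])[k:]


def tag_run(u: str, rules: Dict[str, str], k: int, steps: int) -> List[str]:
    """The words u = u_1, u_2, ... obtained in at most `steps` steps."""
    trace = [u]
    for _ in range(steps):
        nxt = tag_step(trace[-1], rules, k)
        if nxt is None:
            break
        trace.append(nxt)
    return trace
```

With k = 2 and the words bc, a, aaa attached to a, b, c respectively, the word aaa leads successively to abc, cbc, caaa and aaaaa.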


5. Diophantine equations 


We have already mentioned Hilbert’s tenth problem (cf. the Introduction),
which sought an algorithm for testing polynomial equations
P(x_1, ..., x_n) = 0 (with integer coefficients) for possession of a solution in
integers. In this section it will be more convenient to work with non-negative
integer solutions, i.e. solutions in N. We therefore begin by noting
that once it has been established that there is no algorithm for testing
polynomial equations for possession of solutions in N, it will follow at once
that there can be no algorithm which can test for arbitrary integer
solutions. For, if we had an algorithm which could test for integer solutions
(i.e. a solution to Hilbert’s tenth problem), we could use it to decide
whether or not P(x_1, ..., x_n) = 0 has a solution in N by simply testing


P(u_1^2 + v_1^2 + y_1^2 + z_1^2, ..., u_n^2 + v_n^2 + y_n^2 + z_n^2) = 0


for having an integer solution. (This works because every non-negative
integer is the sum of 4 squares.)
Our discussion will revolve about the notion of Diophantine set: 


5.1. DEFINITION. A k-ary relation R on N is Diophantine if for some
polynomial $P(x_1,\ldots,x_k,y_1,\ldots,y_n)$ with integer coefficients, we have:

$$R = \{(x_1,\ldots,x_k) : \exists y_1,\ldots,y_n \in N\,[P(x_1,\ldots,x_k,y_1,\ldots,y_n) = 0]\}.$$


Let R be a Diophantine relation and let $g_R : N^k \to N$ be defined by
$g_R(x_1,\ldots,x_k) = 0$ if $(x_1,\ldots,x_k) \in R$ and $g_R(x_1,\ldots,x_k)$ is undefined
otherwise. If R is connected to the polynomial P as in 5.1, an algorithm for
computing the partial function $g_R$ is to effectively arrange all n-tuples
$\langle y_1,\ldots,y_n\rangle$ in a suitable sequence and to successively evaluate
$P(x_1,\ldots,x_k,y_1,\ldots,y_n)$ until (if ever) the value 0 is obtained. Using
Church’s thesis, we conclude that $g_R$ is a recursive partial function, and
hence that there is a Turing machine which computes $g_R$. Hence, by
Definition 1.3, we conclude that R is recursively enumerable:
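
The search procedure just described can be sketched as follows. This is a minimal illustration under stated assumptions: the polynomial is a Python function, the ordering of tuples (by largest entry) is one arbitrary choice of "suitable sequence", and the optional budget `max_tuples` is a hypothetical addition that lets the demonstration stop.

```python
from itertools import count, product


def tuples_in_sequence(n):
    """Yield every n-tuple of natural numbers exactly once, ordered by largest entry."""
    if n == 0:
        yield ()
        return
    for bound in count(0):
        for t in product(range(bound + 1), repeat=n):
            if max(t) == bound:
                yield t


def g_R(P, n, xs, max_tuples=None):
    """Return 0 as soon as some (y1, ..., yn) with P(xs, ys) == 0 is found.

    With max_tuples=None the search never stops when xs is not in R, which is
    what "undefined" means for the partial function. With a budget, None is
    returned when the budget is exhausted; that gives no information about R.
    """
    for tried, ys in enumerate(tuples_in_sequence(n)):
        if max_tuples is not None and tried >= max_tuples:
            return None
        if P(*xs, *ys) == 0:
            return 0
    return None  # only reachable when n == 0 and P(xs) != 0


# Hypothetical example: R = {x : x is a sum of two squares}.
P = lambda x, y1, y2: y1 * y1 + y2 * y2 - x
print(g_R(P, 2, (5,)))
print(g_R(P, 2, (3,), max_tuples=500))
```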


5.2. THEOREM. Every Diophantine relation is r.e.

The main theorem of the subject is the converse:

5.3. THEOREM (MATIJASEVIČ [1970]). Every r.e. relation is Diophantine.


A detailed proof of Theorem 5.3 and a historical discussion can be found
in the survey article DAVIS [1973] (in which, however, recursive functions
are not defined in terms of Turing machines), and various of its
ramifications are discussed in DAVIS, MATIJASEVIČ and ROBINSON [1976]. We also
refer the reader to these papers for further references.

Before discussing the proof of Theorem 5.3, we consider some of its 
consequences. Using Theorems 1.5 and 5.3 we have: 
5.4. COROLLARY. There is a Diophantine set K which is not recursive.


Writing
$$K = \{x : \exists y_1,\ldots,y_n\,[P(x,y_1,\ldots,y_n) = 0]\}$$
we see that there can be no algorithm for testing each of the equations:
$$P(0,y_1,\ldots,y_n) = 0,$$
$$P(1,y_1,\ldots,y_n) = 0,$$
$$\ldots$$
for having a solution in N. Hence Hilbert’s tenth problem is unsolvable.

For any polynomial $P(x_1,\ldots,x_k)$ let us write $\#(P)$ for the cardinal
number of the set of k-tuples of elements of N which satisfy the equation
P = 0. Thus $0 \le \#(P) \le \aleph_0$. Let us call a set of cardinal numbers $\le \aleph_0$
non-trivial if the set is non-empty and it does not contain all cardinal
numbers $\le \aleph_0$. Then, we shall prove:


5.5. THEOREM (DAVIS [1972]). Given a non-trivial set A of cardinal
numbers $\le \aleph_0$, there is no algorithm which can be used to test a given polynomial
P for whether or not $\#(P) \in A$.


Thus, there is no algorithm to test whether a Diophantine equation has
an even number of solutions, an infinite number of solutions, etc. The
proof we give is a simplified version due to Smoryński. (Cf. also DAVIS,
MATIJASEVIČ and ROBINSON [1976].)

Now, for each polynomial $P(x_1,\ldots,x_k)$ we write

$$T^1(P) = P(x_1,\ldots,x_k)\cdot[(x_1-a_1)^2+\cdots+(x_k-a_k)^2],$$

where $(a_1,\ldots,a_k)$ is the first (in any convenient ordering) k-tuple which
fails to satisfy P = 0. Thus

$$\#(T^1(P)) = \#(P)+1.$$

Setting $T^0(P) = P$, $T^{m+1}(P) = T^1(T^m(P))$, we have:

$$\#(T^m(P)) = \#(P)+m.$$

Finally we write

$$T^\infty(P) = (u+1)\cdot P$$

where u is a variable which does not occur in P. Here

$$\#(T^\infty(P)) = \begin{cases} 0 & \text{if } \#(P) = 0, \\ \aleph_0 & \text{if } \#(P) \ne 0. \end{cases}$$
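
The effect of the two constructions on the number of solutions can be observed on small examples. The sketch below is only an illustration: polynomials are Python functions, the names `T1`, `T_inf` and `count_in_box` are hypothetical, and solutions are counted inside a finite box, so an infinite solution set shows up merely as a count that grows with the box.

```python
from itertools import product


def first_non_solution(P, k):
    """First k-tuple (ordered by largest entry, then lexicographically, k >= 1)
    that fails P = 0. Does not terminate if P vanishes on all of N^k."""
    bound = 0
    while True:
        for a in product(range(bound + 1), repeat=k):
            if max(a) == bound and P(*a) != 0:
                return a
        bound += 1


def T1(P, k):
    """P * [(x1 - a1)^2 + ... + (xk - ak)^2]: adds exactly one new solution."""
    a = first_non_solution(P, k)
    return lambda *x: P(*x) * sum((xi - ai) ** 2 for xi, ai in zip(x, a))


def T_inf(P):
    """(u + 1) * P with a fresh first variable u: 0 or infinitely many solutions."""
    return lambda u, *x: (u + 1) * P(*x)


def count_in_box(P, k, size):
    """Number of solutions of P = 0 with all k coordinates below size."""
    return sum(1 for x in product(range(size), repeat=k) if P(*x) == 0)


P = lambda x: x - 3                      # exactly one solution in N
print(count_in_box(P, 1, 20))
print(count_in_box(T1(P, 1), 1, 20))
print(count_in_box(T_inf(P), 2, 20))     # grows with the box size
```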


The proof of 5.5 proceeds by cases:
Case 1. A = {0}. This is just the question of testing a Diophantine
equation for possession of a solution which we have seen to be unsolvable.
Case 2. $0, \aleph_0 \notin A$. Let $m \in A$. Then,

$$\#(P) = 0 \iff \#(T^m(T^\infty(P))) \in A.$$

Hence an algorithm for this case would yield one for case 1.
Case 3. $0 \notin A$, $\aleph_0 \in A$. Then,

$$\#(P) = 0 \iff \#(T^\infty(P)) \notin A$$

and once again we have a reduction to case 1.

Case 4. $0 \in A$. Then the set B of cardinal numbers $\le \aleph_0$ not in A is in
case 2 or case 3. But an algorithm for testing $\#(P) \in A$ is equivalent to
one for testing $\#(P) \in B$.

Using the normal form theorem (ERT 4.1), i.e. the existence of a
universal Turing machine, we infer the existence of a Diophantine
equation $P(i,x,y_1,\ldots,y_k) = 0$ which is universal in the sense that, setting

$$D_i = \{x : \exists y_1,\ldots,y_k\,[P(i,x,y_1,\ldots,y_k) = 0]\}$$

the sequence $D_0, D_1, D_2, \ldots$ is a list of all Diophantine subsets of N. There
has been considerable interest in finding the best value of k in this result.
In MATIJASEVIČ and ROBINSON [1975], it is shown that we can take k = 13;
more recently Matijasevič has announced that this result can be improved
to k = 9.

This is an appropriate place to discuss an unsolvable decision problem
that was obtained quite early in the history of the subject (CHURCH [1936];
TURING [1936]): the unsolvability of the problem of determining, for a finite
set of sentences T and a sentence $\varphi$ of first order logic, whether or not
$T \vdash \varphi$ in first order logic, cf. the introductory Chapter A.1. Now, Hilbert had
declared that this very problem, the so-called Entscheidungsproblem, was
the fundamental problem of mathematical logic, because a procedure for
solving it could presumably be used to algorithmically settle all
mathematical questions. (See the discussion of Hilbert’s thesis in Section 5 of Chapter
A.1.) Hence once one knows that there are unsolvable mathematical
problems, one should expect to be able to readily infer the unsolvability of
the Entscheidungsproblem. We shall see how this can be done easily using
the unsolvability of Hilbert’s tenth problem.
Let us consider the language
$$L = \{+, \cdot, 0, 1\}$$

where 0, 1 are constant symbols and $+$, $\cdot$ are 2-ary function symbols. Here
the structure $N = (N, +, \cdot, 0, 1)$ is a structure for L. Now, given a
Diophantine equation P = 0, it is easy to construct a sentence $\varphi$ of L of the form
$\exists y_1 \cdots \exists y_n\,(t_1 = t_2)$ where $t_1$ and $t_2$ are terms, such that $N \models \varphi$ if and only if
the equation P = 0 is solvable. Let T be the set of sentences


$((0 + 1) = 1)$
$\forall x\,((x + 0) = x)$
$\forall x\,\forall y\,((x + (y + 1)) = ((x + y) + 1))$
$\forall x\,((x \cdot 0) = 0)$
$\forall x\,\forall y\,((x \cdot (y + 1)) = ((x \cdot y) + x)).$


Then it is clear that for any equation $\theta$ between terms of L containing no
variables, $N \models \theta$ if and only if $T \vdash \theta$, because T obviously suffices to justify
specific calculations in N. Finally using the appropriate rule of first order
logic for introducing existential quantifiers we have:


$T \vdash \varphi$ if and only if P = 0 has a solution.


From the unsolvability of Hilbert’s tenth problem, we then immediately
conclude that the Entscheidungsproblem for L is unsolvable. For sharper
results see KAHR, MOORE and WANG [1962] and AANDERAA and LEWIS
[1974].

We now discuss the proof of Theorem 5.3. Clearly, since

$$P_1 = 0 \wedge P_2 = 0 \iff P_1^2 + P_2^2 = 0,$$
$$P_1 = 0 \vee P_2 = 0 \iff P_1 \cdot P_2 = 0,$$

the class of Diophantine relations is closed under union and intersection.
Hence Diophantine “definitions” can be combined using $\wedge$, $\vee$ and
existential quantifiers to yield new definitions of Diophantine relations. Also the
relations $x < y$, $x \le y$, $x \equiv y \bmod z$ are clearly Diophantine since

$$x < y \iff \exists z\,(x + z + 1 = y)$$
$$x \le y \iff \exists z\,(x + z = y)$$
$$x \equiv y \bmod z \iff \exists t\,(x + tz = y \vee y + tz = x).$$

The steps in the proof of Theorem 5.3 are as follows: 




(i) The relation $z = x^y$ is Diophantine. This key step came last
historically. We shall not give a proof here but refer the reader to any of the
following for a detailed proof: MATIJASEVIČ [1972], DAVIS [1971, 1973],
MATIJASEVIČ and ROBINSON [1975].

(ii) The relations $m = \binom{n}{k}$, $m = n!$ are Diophantine. Let us write
$\Phi(k,u,w)$ for the coefficient of $u^k$ in the u-ary expansion of w. Then the
relation

$$q = \Phi(k,u,w)$$

is a Diophantine relation. For, the relation can be expressed as the
simultaneous solvability in N of the conditions:

$$w = p + q u^k + r,$$
$$q < u, \quad p < u^k,$$
$$r \equiv 0 \bmod u^{k+1}.$$

Now if $u > 2^n$, the expansion

$$(u+1)^n = \sum_{i=0}^{n} \binom{n}{i} u^i$$

shows that $\binom{n}{k}$ is just the coefficient of $u^k$ in the u-ary expansion of $(u+1)^n$.
Hence

$$m = \binom{n}{k} \iff \exists u\,[u = 2^n + 1 \wedge m = \Phi(k, u, (u+1)^n)]$$

so that $m = \binom{n}{k}$ is Diophantine.
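
The digit-extraction idea can be tested numerically. The following Python sketch is an illustration; `phi`, `phi_witness` and `binomial_via_digits` are names chosen here, and the library function `math.comb` is used only as an independent check.

```python
from math import comb


def phi(k, u, w):
    """Coefficient of u**k in the base-u expansion of w (u >= 2)."""
    return (w // u ** k) % u


def phi_witness(k, u, w):
    """Return (p, q, r) with w = p + q*u**k + r, q < u, p < u**k and
    r divisible by u**(k+1); q is then phi(k, u, w)."""
    p = w % u ** k
    q = phi(k, u, w)
    r = w - p - q * u ** k
    return p, q, r


def binomial_via_digits(n, k):
    """Read off the binomial coefficient as a digit of (u + 1)**n, u = 2**n + 1."""
    u = 2 ** n + 1
    return phi(k, u, (u + 1) ** n)


print(binomial_via_digits(6, 2), comb(6, 2))
```

The base $u = 2^n + 1$ exceeds every binomial coefficient $\binom{n}{i} \le 2^n$, so no carries occur and each digit is exactly one coefficient.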
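Elementary inequalities show that when $r > (2n)^{n+1}$ we have

$$n! \le \frac{r^n}{\binom{r}{n}} < n! + 1.$$

(For details see for example Davis [1973].) Hence,

$$m = n! \iff \exists r,s,t,u\,\{s = 2n+1 \;\&\; r = s^{n+1} \;\&\; t = r^n \;\&\; u = \binom{r}{n} \;\&\; um \le t < u(m+1)\}.$$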
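
The inequality says that n! is the integer part of $r^n / \binom{r}{n}$ once r is large enough, and this can be confirmed for small n. A minimal sketch follows; the function name is hypothetical, and `math.comb` and `math.factorial` serve only as independent checks.

```python
from math import comb, factorial


def factorial_via_binomial(n):
    """Recover n! from powers and one binomial coefficient."""
    s = 2 * n + 1
    r = s ** (n + 1)      # r = (2n+1)^(n+1) > (2n)^(n+1)
    t = r ** n
    u = comb(r, n)
    return t // u         # the unique m with u*m <= t < u*(m+1)


for n in range(6):
    print(n, factorial_via_binomial(n), factorial(n))
```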


(iii) The bounded quantifier theorem. Let $P(u,x,y,k,z_1,\ldots,z_\nu)$ be a
polynomial. Then, there is a polynomial $Q(u,x,y)$ such that

$$\forall k \le x\;\exists z_1,\ldots,z_\nu \le y\,[P(u,x,y,k,z_1,\ldots,z_\nu) = 0]$$

$$\iff \exists b_1,\ldots,b_\nu \left[\binom{b_1}{y+1} \equiv \cdots \equiv \binom{b_\nu}{y+1} \equiv P(u,x,y,Q!-1,b_1,\ldots,b_\nu) \equiv 0 \bmod \binom{Q!-1}{x+1}\right].$$

This is an improved version of a result from DAVIS, PUTNAM and ROBINSON
[1961]. For its proof see DAVIS, MATIJASEVIČ and ROBINSON [1976], p. 21.
(iv) We now have: 


5.6. THEOREM. Let R be a Diophantine relation and let
$$(x_1,\ldots,x_k,x) \in S \iff \forall t \le x\,[(x_1,\ldots,x_k,t) \in R].$$

Then S is Diophantine.


PROOF. We have for a suitable polynomial P,
$$(x_1,\ldots,x_k,x) \in S \iff \forall t \le x\;\exists z_1,\ldots,z_n\,[P(t,x_1,\ldots,x_k,z_1,\ldots,z_n) = 0]$$
$$\iff \exists y\;\forall t \le x\;\exists z_1,\ldots,z_n \le y\,[P(t,x_1,\ldots,x_k,z_1,\ldots,z_n) = 0]$$

since the bound y can be chosen as the largest of n(x + 1) integers. The
result follows from (ii) and (iii). □


(v) Now let P be an arbitrary k-ary r.e. relation. Let T be a Turing
machine with instructions $q_0,\ldots,q_v$ such that T eventually halts when
started at $q_0$ scanning the leftmost symbol of ⌜$x_1$⌝0⌜$x_2$⌝0···0⌜$x_k$⌝ if and only
if $(x_1,x_2,\ldots,x_k) \in P$. Let T’ be obtained from T by adjoining to its table all
quintuples:

    i a a R (v+1)

where $0 \le i \le v$ and the table of T has no quintuple beginning i a. Thus
$(x_1,\ldots,x_k) \in P$ if and only if T’ eventually encounters instruction $q_{v+1}$ (and
then halts) when started at $q_0$ scanning the leftmost symbol of
⌜$x_1$⌝0⌜$x_2$⌝0···0⌜$x_k$⌝.

We shall code configurations of T’ by triples (l, n, r) of natural numbers.
Here n is simply the number of the instruction about to be executed and l
is a code for the sequence of 0’s and 1’s on the tape to the left of the
scanned square; specifically l is just the natural number whose binary
representation is this sequence (where of course the infinite sequence of 0’s
on the left has no effect on l). Similarly r is the natural number whose
binary representation is the sequence of 0’s and 1’s beginning with and to
the right of the scanned square, read from right to left. Thus, for example,
the tuple (13, 4, 19) represents a machine configuration in which the tape
contents are

    ...000110111001000...,

the scanned square contains 1 as do its left and right immediate neighbors,
and instruction $q_4$ is about to be executed.
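
The coding of the two halves of the tape can be reproduced in a few lines. In this sketch (an illustration; `encode` and `decode` are names chosen here) tape segments are written as strings of 0's and 1's read from left to right, with leading and trailing blanks omitted.

```python
def encode(left_bits, right_bits):
    """left_bits: the tape to the left of the scanned square, left to right.
    right_bits: the scanned square followed by the tape to its right.
    Returns (l, r); r is obtained by reading right_bits from right to left."""
    l = int(left_bits, 2) if left_bits else 0
    r = int(right_bits[::-1], 2) if right_bits else 0
    return l, r


def decode(l, r):
    """Inverse of encode, up to 0's adjoining the infinite blank parts."""
    left = bin(l)[2:] if l else ""
    right = bin(r)[2:][::-1] if r else ""
    return left, right


# The example from the text: tape ...000 1101 11001 000..., scanning the first
# symbol of 11001, with instruction number 4.
print(encode("1101", "11001"))
print(decode(13, 19))
```

Reading the right half backwards puts the scanned square in the lowest binary digit of r, which is what makes a single machine step expressible by simple arithmetic on l and r.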


The initial configuration of T’ with inputs $x_1,\ldots,x_k$ is represented by the
triple $(0, 0, g_k(x_1,\ldots,x_k))$, where

$$g_1(x) = 2^{x+1} - 1,$$

$$g_{k+1}(x_1,\ldots,x_{k+1}) = 2^{x_1+2}\, g_k(x_2,\ldots,x_{k+1}) + g_1(x_1).$$

Here, as we verify by an easy induction, the relation $z = g_k(x_1,\ldots,x_k)$ is
Diophantine for each fixed k.

For Q a quintuple from the table of T’, let $S_Q$ be the set of all
$(l,n,r,l',n',r')$ such that Q acts on configuration (l, n, r) to produce
configuration $(l',n',r')$. For each fixed quintuple Q, $S_Q$ is a Diophantine
relation as we proceed to show.

For, if Q has the form

    i a b R j,

then

$$(l,n,r,l',n',r') \in S_Q \iff n = i \wedge n' = j \wedge l' = 2l + b \wedge r = 2r' + a.$$

On the other hand, if Q has the form

    i a b L j,

then

$$(l,n,r,l',n',r') \in S_Q \iff n = i \wedge n' = j \wedge [(l = 2l' \wedge r' = 2(r + b - a)) \vee (l = 2l' + 1 \wedge r' = 2(r + b - a) + 1)].$$

Let $Q_1, Q_2, \ldots, Q_p$ be a list of all of the quintuples from the machine table
of T’ and let $S = \bigcup_{i=1}^{p} S_{Q_i}$. Thus, S is a Diophantine relation and
$(l,n,r,l',n',r') \in S$ just in case some quintuple of T’ acts on configuration
(l, n, r) to produce configuration $(l',n',r')$.
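
The two arithmetic conditions can be read as recipes for computing the next configuration. The sketch below is an illustration under stated assumptions: the function names are hypothetical, instruction numbers are left out because they simply change from i to j, and the test `r % 2 != a` is added here to make explicit that the quintuple applies only when the scanned symbol is a.

```python
def step_right(l, r, a, b):
    """Quintuple 'i a b R j': write b over the scanned a, then move right.
    Returns the new (l, r), or None when the scanned symbol is not a."""
    if r % 2 != a:
        return None
    return 2 * l + b, r // 2          # l' = 2l + b and r = 2r' + a


def step_left(l, r, a, b):
    """Quintuple 'i a b L j': write b over the scanned a, then move left."""
    if r % 2 != a:
        return None
    c = l % 2                         # the square that becomes scanned
    return l // 2, 2 * (r + b - a) + c


# Starting from the configuration coded by l = 13, r = 19 (scanned symbol 1):
print(step_right(13, 19, 1, 0))
print(step_left(13, 19, 1, 0))
```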

Now a computation of T’ is a sequence of configurations $(l_i, n_i, r_i)$,
$i = 0,1,\ldots,k+1$ such that $(l_i,n_i,r_i,l_{i+1},n_{i+1},r_{i+1}) \in S$ for $i = 0,1,\ldots,k$ and
$n_{k+1} = v + 1$. We shall regard the numbers $l_i, n_i, r_i$ as digits to a sufficiently
large base B, i.e. we set

$$L = \sum_{i=0}^{k+1} l_i B^i, \quad N = \sum_{i=0}^{k+1} n_i B^i, \quad R = \sum_{i=0}^{k+1} r_i B^i.$$

Using the function $\Phi$ from (ii), we have finally,

$$(x_1,\ldots,x_k) \in P \iff \exists B,L,N,R,k\,\{\Phi(0,B,L) = 0 \wedge \Phi(0,B,N) = 0 \wedge \Phi(0,B,R) = g_k(x_1,\ldots,x_k) \wedge \Phi(k+1,B,N) = v+1 \wedge$$
$$\forall i \le k\,[(\Phi(i,B,L), \Phi(i,B,N), \Phi(i,B,R), \Phi(i+1,B,L), \Phi(i+1,B,N), \Phi(i+1,B,R)) \in S]\}.$$

Using Theorem 5.6, it follows that P is Diophantine.
Survey and basic notions


The study of decidability involves trying to establish, for a given 
mathematical theory T, or a given problem P, the existence of a decision 
algorithm AL which will accomplish the following task. Given a sentence A 
expressed in the language of T, the algorithm AL will determine whether A 
is true in T, i.e. whether A € T. In the case of a problem P, given an 
instance I of the problem P, the algorithm AL will produce the correct 
answer for this instance. Depending on the problem P, the answer may be
“yes” or “no”, an integer, etc.

If such an algorithm does exist, then we shall variously say that the 
decision problem of T or P is solvable, or that the theory T is decidable, or 
simply that the problem P is solvable. Of AL we shall say that it is a 
decision procedure for T or P. Let us illustrate our concepts by two 
celebrated decidability results. 

Let L be a first-order language appropriate for expressing statements
about planar Euclidean Geometry. Thus L has individual variables ranging
over points; two ternary predicates L(x, y, z) and B(x, y, z) to denote
colinearity and betweenness; two predicates C(x, y, z, u, v, w) and
A(x, y, z, u, v, w) to denote congruence of triangles and congruence of
angles (i.e. $\triangle xyz \cong \triangle uvw$ and $\angle xyz \cong \angle uvw$); and a quaternary predicate
E(x, y, u, v) to denote equality of length (i.e. xy = uv). The formula

$$\forall x\,y\,z\,u\,v\,[A(x,y,v,z,y,v) \wedge A(x,z,u,y,z,u) \wedge B(x,y,u) \wedge B(x,z,v) \wedge E(u,z,v,y) \to E(x,y,x,z)],$$

for example, is an expression in L of the famous high school problem to the
effect that if in a triangle the angle bisectors are equal, then the triangle is
isosceles.

TARSKI [1951] has proved that the theory EG consisting of all sentences 
of L true in planar Euclidean Geometry, the so-called elementary 
geometry, is decidable. 

Actually Tarski proved a stronger result. Let $\mathfrak{R} = (R, +, \cdot)$ be the field
of real numbers, then the first-order theory Th($\mathfrak{R}$) is decidable. This last
result implies the decidability of elementary geometry via the introduction 
of Cartesian coordinates and the reduction of geometric statements to 
equivalent algebraic statements. 

Our second example is also related to Euclid. The problem GCD
consists of finding for pairs a, b of natural numbers their greatest common
divisor (a, b). A slight variant of Euclid’s famous algorithm is based on
the facts that (a, 0) = a and, for $a \le b$, (a, b) = (a, b - a) (and symmetrically
(a, b) = (a - b, b) for $b \le a$). A succession of steps of the second type will
transform any g.c.d. (a, b) into (c, 0), so that (a, b) = c. Thus
(27, 15) = (12, 15) = (12, 3) = (9, 3) = (6, 3) = (3, 3) = (3, 0) = 3.
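
The subtraction variant can be written out directly. In this sketch (an illustration; the function name and the recorded trace are additions made here) the two rules are applied until the second component is 0.

```python
def gcd_by_subtraction(a, b):
    """Return ((a, b), trace) where trace lists the successive pairs."""
    trace = [(a, b)]
    while b != 0:
        if a == 0:
            a, b = b, 0          # (0, b) = (b, 0) by symmetry
        elif a <= b:
            b -= a               # (a, b) = (a, b - a) for a <= b
        else:
            a -= b               # the symmetric rule for b < a
        trace.append((a, b))
    return a, trace


value, trace = gcd_by_subtraction(27, 15)
print(value)
print(trace)
```

Each step strictly decreases a + b (or ends the loop), so the procedure terminates for all natural numbers a, b.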

The assignment of a precise mathematical meaning to decidability
involves the notion of a computable or recursive function. By an
appropriate Gödel numbering G, the set of all sentences of a language L is 1-1
mapped onto a recursive subset $S \subseteq N$ of the set N of natural numbers. This
transforms T into a set $G(T) \subseteq S$. The theory T is, by definition, decidable
if and only if G(T) is a recursive set, i.e. its characteristic function $f_T$ is
computable. Solving the decision problem of T involves producing a
program or algorithm for computing $f_T$, or at least proving that $f_T$ is
computable. The notion of solvability of a problem P is explicated in a
similar fashion.

As explained elsewhere in this volume, the theory T is defined to be
undecidable if $f_T$ is not recursive.

There is a significant methodological difference between the study of
decidability and the study of undecidability, and this despite the obvious
fact that the two concepts are just the opposite sides of the same coin.
Attempts to establish the undecidability of a theory T must presuppose a
mathematically precise notion of computable functions. For only after we
know what a computable function is, can we prove that $f_T$ is not
computable.

On the other hand, the decision problem of a theory T can be solved by 
exhibiting a decision algorithm AL which is directly recognized and 
accepted by mathematicians as being an effective computational proce- 
dure. Thus, for example, Euclid’s algorithm given above, when sup- 
plemented by explicit rules for comparison and subtraction of natural 
numbers, is universally agreed upon as constituting a computational 
method. 

This state of affairs accounts for the historical fact that some decidability 
results preceded the definition of recursive functions in the mid-Thirties, 
whereas the first undecidability results only followed the formulation of 
this definition. 

There are three main methods for establishing decidability of theories.
The first and oldest is elimination of quantifiers. This method consists of
transforming the given sentence A into another sentence B such that
$T \vdash A \leftrightarrow B$ and B belongs to a class $\mathcal{K}$ of sentences for the members of
which we can directly determine whether they are in T.

The second method is model-theoretic. In its typical form it involves a
(recursive) set of axioms AX for the theory T. Model-theoretic methods
are employed to either show that AX is complete, in which case T is
readily seen to be decidable, or to systematically survey all completions of
AX. In the latter case we shall know for each sentence A whether
$T \cup \{A\}$ is consistent and consequently whether $T \vdash A$.

In certain cases a combination of the two methods is used. Namely a set
$\mathcal{K}$ of sentences of L is judiciously chosen; the fact that for every sentence A
of L there exists a sentence $B \in \mathcal{K}$ equivalent to A by T is then established
by model-theoretic means. Also the sentences $B \in \mathcal{K}$ true in T are picked
out by a survey of models of T.

The third method involves interpretations. Let $T_0$ be a theory known to
be decidable, and let T be any theory. Assume that we have a (computable)
map $\tau$ which transforms or translates each sentence A of the
language L into a sentence $\tau(A)$ of the language $L_0$ so that $\tau(A) \in T_0$ if and
only if $A \in T$. Under these conditions T is decidable. For in order to
determine whether $A \in T$ we just find $\tau(A)$ and check whether $\tau(A) \in T_0$.
Usually the interpretation $\tau$ involves model-theoretic considerations. It is
shown that models of T can be isomorphically reproduced from models of
$T_0$ by relations definable in $L_0$. This method was used in RABIN [1969] to
establish most of the then known decidability results as well as several new
ones.

The method of interpretations is, mutatis mutandis, also a powerful tool
for proofs of undecidability. For if the theory T is known to be undecidable,
and is interpretable in $T_0$ in the manner of the previous paragraph,
then $T_0$ must be undecidable, see RABIN [1965].

The study of decidability should be viewed as a component, or natural 
outgrowth, of Hilbert’s Programme for the foundations of mathematics.
Hilbert envisioned a codification of the various branches of mathematics by 
systems of axioms, with an axiomatized logic serving as a common basis for 
deduction of consequences (theorems) from the axioms. Hilbert hoped that 
such a formalization would turn the derivation of mathematical results into 
a mechanical game with strings of symbols. According to Hilbert’s plan, 
this would give us such a comprehensive survey of all formal theorems 
within any mathematical discipline, that we would be able to demonstrate 
that no formal statement and its negation are jointly provable, thereby 
demonstrating the consistency of mathematics. Also implied by Hilbert’s 
Programme is the belief that the process of theorem-proving is mechaniz- 
able or, in modern parlance, that mathematical theories are decidable. 
Failing to implement Hilbert’s plan for mathematics as a whole, by proving 
the consistency and decidability of, say, set theory, researchers turned their 
attention to more restricted segments of mathematics. Many of the
decidability results which we shall discuss later on, such as Presburger’s 
decision method for the theory of addition of natural numbers, were 
obtained in the Twenties and early Thirties and were motivated by 
Hilbert’s plan. 

Gödel’s celebrated Incompleteness Theorems and Church’s
Undecidability result, dating back to the early and mid-Thirties, dashed the hopes
for realization of Hilbert’s Programme in its original form. Namely, Gödel
demonstrated the impossibility of proving the consistency of any
appreciable portion of mathematics by the finitist methods advocated by Hilbert.
And Church proved that the predicate calculus, as well as the arithmetic of 
addition and multiplication of natural numbers, are undecidable. These 
results put into perspective the study of decidability and engendered a 
considerable body of research into the decidability and undecidability of 
various mathematical theories. 

Only in recent years has attention turned to the issue of the computational
complexity of solvable decision problems. In the spirit of Hilbert’s
Programme and of Turing’s analysis of computability, it was tacitly assumed
that for a theory T proved decidable, the question whether a given
sentence is a theorem of T is a trivial one. For one needs only to
mechanically apply the decision procedure in order to answer any such 
question. No creative or intelligent thinking is required for this process. 
From this point of view, any decidable theory is trivial and uninteresting. 
Work of Fischer, Meyer, Rabin, and others has caused a reevaluation of 
this attitude. They have shown that many theories, even though decidable, 
are from the practical point of view undecidable because any decision 
algorithm would require a practically impossible number of computation 
steps. For the arithmetic of addition of natural numbers, proved decidable 
by Presburger, FISCHER and RABIN [1974] have proved that for every
decision algorithm AL there exist sentences A of size (i.e. number of 
symbols) n such that AL requires $2^{2^{cn}}$ computational steps to decide A, where c > 0 is a fixed constant.
MEYER [1975] has proved even more devastating complexity results
for theories such as the theory of linear-order. Fischer and Rabin have also 
shown that Elementary Geometry has a decision problem which is 
inherently of exponential complexity. Results such as these cast doubt on 
the assertion that any theory proved decidable is trivial because its 
theorems could be checked by a computer program. Computations
involving, say, $2^{2^n}$ steps cannot be considered as a feasible method for
establishing the truth of a mathematical statement.

Are there any theories with a practically solvable decision problem? 
Startlingly enough the answer to this fundamental question is as yet
unknown. It is readily seen that the decision problem of any formalized
theory is at least as complex as the decision problem PC of the
propositional calculus. Now COOK [1971] has observed that many combinatorial
decision problems which have defied attempts at producing an efficient, not 
exponentially complex, decision procedure, are reducible to PC. This lends 
credence to the conjecture that PC is exponentially complex. Cook has 
related this question, via a use of Turing machines, to the question whether 
non-deterministic computations requiring a number of steps polynomial in 
the problem size, are always equivalent to ordinary deterministic computa- 
tions requiring a polynomial number of steps. This so-called P = NP 
problem is, as of the time of writing of this paper, one of the central open 
questions in theoretical computer science. It also relates to the older 
“spectrum problem” concerning models of sentences of the predicate 
calculus. Details will be given in the text. 

Since the study of decidability involves methods from propositional and 
predicate logic, theory of computable functions, and theory of models, we 
shall have to rely on these prerequisites in our exposition. In most cases 
only the rudiments of these subjects will be required for following 
discussions. The uninitiated reader is urged to consult the relevant chapters 
of this book for any auxiliary information that he may need. 


1. The method of elimination of quantifiers 


1.1. Theories and models 

The theories dealt with in the study of decidability will present them- 
selves in one of two ways: axiomatically or semantically, as the set of 
sentences true in a structure or class of structures. Usually we shall 
consider first-order theories, i.e. theories expressible in some first-order 
predicate language. This rule will, however, have some very important 
exceptions. 


DEFINITION 1. Let L be some fixed first-order language and let H be a
recursive consistent set of sentences of L. The theory axiomatized by H is,
by definition, the set Th(H) of all logical consequences of H:


$$Th(H) = \{A \mid H \vdash A\}.$$




The theory Th(H) and the set of axioms H are called complete if for
every sentence A of L we have $A \in Th(H)$ or $\neg A \in Th(H)$.


THEOREM 1. If the theory T is axiomatizable and complete then T is 
decidable. 


PROOF. Let H be an axiomatization of T. The sequences $S_i =
(A_{i1}, A_{i2}, \ldots, A_{in_i})$, which are formal proofs from the axioms H, can be
effectively enumerated. This can be done, for example, by enumerating all
finite sequences (words) on the alphabet of L and deleting those sequences
which are not proofs. The last members $A_{in_i}$ of the proofs enumerated run
through all statements A which are provable from H, i.e. through all
elements of Th(H) = T.

Thus, the theorems of T can be effectively (computationally) enumerated
in a sequence $S = (A_1, A_2, \ldots)$. Let now A be any sentence of L. Start
enumerating S and for each $A_n$ obtained check whether $A_n = A$ or
whether $A_n = \neg A$. Since T is complete, one of the two alternatives must
eventually occur, at which time we shall know whether $A \in T$ or
$A \notin T$. □
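
The decision procedure of the proof has a very simple shape once an enumeration of theorems is available. The sketch below is an illustration only: `decide` is a hypothetical name, and the "theory" used to exercise it is a toy invented here (its theorems are the true statements "n is even" or "n is not even"), standing in for a real proof enumerator.

```python
from itertools import count


def decide(sentence, theorems, negate):
    """Search an enumeration of the theorems of a complete theory for the
    sentence or its negation. Termination relies on completeness."""
    negation = negate(sentence)
    for theorem in theorems:
        if theorem == sentence:
            return True
        if theorem == negation:
            return False


def toy_theorems():
    """Toy enumeration: for each n, the true one of even(n), not even(n)."""
    for n in count():
        atom = ("even", n)
        yield atom if n % 2 == 0 else ("not", atom)


def negate(sentence):
    """Toy negation that also removes a leading negation sign."""
    return sentence[1] if sentence[0] == "not" else ("not", sentence)


print(decide(("even", 10), toy_theorems(), negate))
print(decide(("even", 7), toy_theorems(), negate))
```

No bound on the length of the search is known in advance; the loop stops only because one of the two sentences must eventually be enumerated.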


People seeing the above argument for the first time often encounter 
some difficulty in convincing themselves that the proposed procedure is a 
valid computational process for deciding T. This is because an essential 
feature of a computation is that we are sure it will terminate, while here 
when given A we have no a-priori bound on the number of steps required 
before the computation terminates. However, the fact that T is complete 
ensures that the computation will terminate by either A or $\neg A$ being
encountered in the enumeration. This constitutes a proof of termination of 
the algorithm. In fact, the number-of-steps function is thereby shown to be 
a calculable function, albeit not of the commonly encountered variety such
as $n^2$.

The idea underlying Theorem 1 can be extended to cases where the 
theory T is not complete. 


THEOREM 2. Let T be axiomatizable and assume that there exists a recursive
(computable) sequence $A_1, A_2, \ldots$ of sentences satisfying the following
conditions.

(1) $T \cup \{A_n\}$ is consistent for every n.

(2) Every completion $T \subseteq T_1$ of T has a (not necessarily recursive) set of
axioms $B = \{B_1,\ldots,B_k,\ldots\}$ such that $T_1 = Th(B)$, and for every k there
exists an n for which $T \vdash A_n \leftrightarrow B_1 \wedge \cdots \wedge B_k$.

Conditions (1)-(2) imply that T is decidable.


PROOF. Let H be a recursive axiomatization for T. By a process resembling
dovetailing we can computationally enumerate in one sequence
$D_1, D_2, \ldots$ the logical consequences of all the sets $H_n = H \cup \{A_n\}$,
n = 1, 2, .... Namely, start with an effective enumeration of the consequences
of $H_1$ as in the proof of Theorem 1. When the first consequence $H_1 \vdash D_1$ is
encountered, start enumerating consequences of $H_2$ until the first one, $D_2$,
is obtained. Now return to generate the second consequence, call it $D_3$, of
$H_1$. Then return to $H_2$ to obtain $H_2 \vdash D_4$, and next obtain the first
consequence, call it $D_5$, of $H_3$; and so on.
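
Dovetailing is easily expressed with generators. The sketch below is an illustration: `dovetail` and `make_stream` are hypothetical names, and the interleaving order is the usual round-by-round one, which differs slightly from the order described in the proof but shares its essential property that every member of every stream is eventually produced.

```python
from itertools import count, islice


def dovetail(make_stream):
    """make_stream(n) returns an iterator for the n-th stream, n = 1, 2, ...
    In round n, stream n is started and one further item is taken from each
    stream started so far. Yields pairs (n, item)."""
    active = []
    for n in count(1):
        active.append((n, make_stream(n)))
        for index, stream in active:
            try:
                yield index, next(stream)
            except StopIteration:
                pass  # a finite stream that is already exhausted


# Hypothetical usage: stream n enumerates the multiples of n.
multiples = lambda n: (n * j for j in count(1))
print(list(islice(dovetail(multiples), 6)))
```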

Again dovetailing, effectively enumerate the above sequence
$D_1,\ldots,D_n,\ldots$ and also the sequence $E_1,\ldots,E_n,\ldots$ of all consequences
of H. The second sequence is, in fact, an enumeration of T = Th(H). Thus,
if $A \in T$ then A will appear among the $E_n$’s. We claim that if $A \notin T$, then
$\neg A$ will appear among the $D_n$’s. Thus the double enumeration will
computationally yield an answer to the question whether $A \in T$.

To establish the claim, note that if $A \notin T$, then $T \cup \{\neg A\}$ is consistent.
Hence it is a subset of a complete theory, $T \cup \{\neg A\} \subseteq T_1$. Let $B =
\{B_1, B_2, \ldots\}$ be the set of axioms for $T_1$ mentioned in the assumptions on
the sequence $A_1, A_2, \ldots$. Then $B \vdash \neg A$ and consequently for some
integer k, $B_1 \wedge \cdots \wedge B_k \vdash \neg A$. By our assumptions there exists an n so that
$T \vdash A_n \leftrightarrow B_1 \wedge \cdots \wedge B_k$. Hence $T \cup \{A_n\} \vdash \neg A$, and $\neg A$ appears in the
sequence $D_1, D_2, \ldots$. □


Theorem 2 is used to establish decidability in cases where T is not 
complete and yet we can somehow survey all possible completions of T. 


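DEFINITION 2. Let $\mathfrak{A} = (A, R_1, R_2, \ldots)$ be a structure and L a first-order
language appropriate for $\mathfrak{A}$; that is, L has a symbol $P_i$ corresponding to any
relation or function $R_i$ of $\mathfrak{A}$. The theory Th($\mathfrak{A}$) of $\mathfrak{A}$ is the set of sentences
of L true in $\mathfrak{A}$:


$$Th(\mathfrak{A}) = \{B \mid \mathfrak{A} \models B\}.$$


If $\mathcal{K}$ is a class of structures all of the same type, and L is a language
appropriate to one and hence all structures in $\mathcal{K}$ then by definition,


$$Th(\mathcal{K}) = \bigcap_{\mathfrak{A} \in \mathcal{K}} Th(\mathfrak{A}).$$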
We shall now give examples of theories defined by axioms as well as
examples of theories defined by models, i.e. defined semantically.


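EXAMPLE 1. Let L be a language with just one binary predicate symbol $\le$
(less than or equal to). Consider the set of axioms OR:


1. $\forall x\,\forall y\,[x \le y \wedge y \le x \to x = y]$,
2. $\forall x\,\forall y\,[x \le y \vee y \le x]$,
3. $\forall x\,\forall y\,\forall z\,[x \le y \wedge y \le z \to x \le z]$.


Any model $(A, \le) \models OR$ is a totally ordered set.
Let us introduce the abbreviation $x < y$ to stand for $x \le y \wedge \neg\, x = y$.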


EXAMPLE 2. Consider the axioms DO obtained by adding to OR the
axioms


4. $\exists x\,\forall y\,[y \le x \to x = y]$,
5. $\forall x\,\exists y\,[x < y \wedge \forall z\,[x < z \to y \le z]]$,
6. $\forall x\,\forall y\,[y < x \to \exists z\,\forall w\,[z < x \wedge [z < w \to x \le w]]]$.


Any model $(A, \le) \models DO$ is a totally ordered set which has a first
element, every element has a unique immediate successor, and every
element except the first element has a unique predecessor. Such orders will
be called discrete orders.


Denoting, as usual, $\omega = \{0,1,2,\ldots\}$ we see that $\mathfrak{N} = (\omega, \le) \models DO$. If we
denote by $\omega^*$ the reverse order-type of $\omega$ (i.e. $0 > 1 > 2 > \cdots$), then every
ordered set which is a model of DO has order type $\omega + (\omega^* + \omega)\lambda$, where $\lambda$
is any order type.


EXAMPLE 3. Consider the class $\mathcal{K}_{uf}$ of all structures (A, f) where f is a
unary function $f : A \to A$. Th($\mathcal{K}_{uf}$) is the theory of a unary function.


EXAMPLE 4. Let $(\omega, +)$ be the structure of the natural numbers with addition.
Th($(\omega, +)$) = PAR will be called Presburger’s arithmetic or the theory of
addition of natural numbers.


1.2. Elimination of quantifiers for discrete orders 

Thus far we have seen examples of axiomatically defined theories and of 
semantically defined theories. We also gave two principles for establishing 
the decidability of axiomatized theories. We shall now illustrate the 
method of elimination of quantifiers by proving the decidability of
Th($(\omega, \le)$), a result due to C. H. Langford in 1927.


THEOREM 3. Th($\mathfrak{N}$) is decidable.


PROOF. Let us enrich the structure $\mathfrak{N}$ by making 0 a distinguished element
and adding the successor function S(x) = x + 1 (the element next to x in
the ordering). Call the resulting structure $\mathfrak{N}_1 = (\omega, 0, \le, S)$ and denote the
corresponding language by $L_1$. We shall decide Th($\mathfrak{N}_1$), from which the
decidability of Th($\mathfrak{N}$) follows.

Let us use the abbreviation $S^n(t)$ to denote n applications of S to the
term t, thus $S^3(x) = S(S(S(x)))$. In particular, $S^0(y) = y$. The terms of the
language are $0, x, y, \ldots, S^n(0), S^n(x), \ldots$.

The class $\mathcal{K}$ of formulas to which we shall reduce every formula of $L_1$
will be the following

(1) $t_1 = t_2$, $t_1 < t_2$, where $t_1, t_2$ are terms;

(2) formulas which are disjunctions of conjunctions of formulas in (1).
For example $[S^2(0) = x \wedge S(y) < z] \vee S^2(z) < S^3(y)$ is in $\mathcal{K}$.

We shall show that every formula A of L, is equivalent in 3, to a 
formula BE &. Our proof will also provide a method for effectively 
transforming A into B. 

The statement that two formulas C and D are equivalent in 9t,; means 
that 3%, C<D. We shall make assertions concerning equivalence of 
formulas leaving verification to the reader. 

Let A be an open, i.e. quantifier-free, formula. Express all other
propositional connectives by means of ∨, ∧, ¬. Move all occurrences of ¬
next to the atomic formulas, using rules such as ¬[C ∨ D] ≡ ¬C ∧ ¬D.
Drop all occurrences of double-negation ¬¬. Replace constituents of the
form t₁ ≤ t₂ by t₁ = t₂ ∨ t₁ < t₂, ¬[t₁ = t₂] by t₁ < t₂ ∨ t₂ < t₁, and
¬ t₁ < t₂ by t₁ = t₂ ∨ t₂ < t₁. Finally, by use of the distributive law for
∧ and ∨, transform the formula into a formula in Φ. Thus we see that
repeated applications of the above rules will transform any open formula
into an equivalent formula in Φ.
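
The rewriting rules just listed can be tried out mechanically. The following
is only a minimal sketch under stated assumptions: formulas are encoded as
nested Python tuples, terms are plain variable names, and the function names
(nnf, dnf, to_phi, holds, holds_dnf) are illustrative and do not come from
the text. The evaluator is included only so that the equivalence of a formula
and its normal form can be checked on sample values.

```python
from itertools import product

# Formulas (an assumed encoding): ('eq', s, t), ('lt', s, t), ('le', s, t),
# ('not', F), ('and', F, G), ('or', F, G); terms s, t are variable names.

def nnf(f, negate=False):
    """Push negations to the atoms, then remove them by the rewrite rules."""
    op = f[0]
    if op == 'not':
        return nnf(f[1], not negate)          # absorbs double negation
    if op in ('and', 'or'):
        if negate:                            # De Morgan
            op = 'or' if op == 'and' else 'and'
        return (op, nnf(f[1], negate), nnf(f[2], negate))
    s, t = f[1], f[2]
    if op == 'le':                            # s <= t  becomes  s = t or s < t
        return nnf(('or', ('eq', s, t), ('lt', s, t)), negate)
    if op not in ('eq', 'lt'):
        raise ValueError(f"unknown connective: {op}")
    if not negate:
        return f
    if op == 'eq':                            # not s = t  becomes  s < t or t < s
        return ('or', ('lt', s, t), ('lt', t, s))
    return ('or', ('eq', s, t), ('lt', t, s))  # not s < t

def dnf(f):
    """Distribute 'and' over 'or'; the result is a list of lists of atoms."""
    op = f[0]
    if op == 'or':
        return dnf(f[1]) + dnf(f[2])
    if op == 'and':
        return [c1 + c2 for c1, c2 in product(dnf(f[1]), dnf(f[2]))]
    return [[f]]

def to_phi(f):
    """Normal form of an open formula: a disjunction of conjunctions of atoms."""
    return dnf(nnf(f))

def holds(f, env):
    """Truth value of a formula when env maps each variable to a number."""
    op = f[0]
    if op == 'not':
        return not holds(f[1], env)
    if op == 'and':
        return holds(f[1], env) and holds(f[2], env)
    if op == 'or':
        return holds(f[1], env) or holds(f[2], env)
    s, t = env[f[1]], env[f[2]]
    return {'eq': s == t, 'lt': s < t, 'le': s <= t}[op]

def holds_dnf(d, env):
    return any(all(holds(atom, env) for atom in conj) for conj in d)
```

The sketch omits terms of the form S^n(t); since the rules never look inside
terms, adding them would only change the evaluator.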

Assume now that A₁ has the form ∃y [C₁ ∨ ··· ∨ C_k] where each C_i is a
conjunction of formulas t₁ = t₂ or t₁ < t₂. We have A₁ ≡ ∃y C₁ ∨ ··· ∨ ∃y C_k.
Thus it suffices to give rules for transforming a formula of the form


A₂ = ∃y [t₁ = t₂ ∧ ··· ∧ t_{2m-1} = t_{2m} ∧
         t_{2m+1} < t_{2m+2} ∧ ··· ∧ t_{2k-1} < t_{2k}].


The reader can work out for himself how to treat the case that some
equation or inequality in A₂ is of the form S^m(y) = S^n(y) or
S^m(y) < S^n(y). Thus we may assume that in each equation or inequality
in A₂ at most one side is of the form S^n(y).

For any terms t₁, t₂ and 1 ≤ n, 𝔑₁ ⊨ t₁ < t₂ ↔ S^n(t₁) < S^n(t₂), and
similarly for t₁ = t₂. By applying to both sides of all equations and
inequalities in A₂ appropriate powers S^n, we transform A₂ into an
equivalent formula in which all occurrences of y are of the form S^m(y),
with the same 1 ≤ m. Let us assume A₂ already has this property.
"Eliminate" ∃y from A₂ by:
(1) dropping ∃y from A₂; (2) for each conjunct S^m(y) = t add a conjunct
S^{m-1}(0) < t; (3) for each S^m(y) < t add S^m(0) < t; (4) if any equation
S^m(y) = t occurs in A₂, replace all occurrences of S^m(y) in A₂ by t;
(5) assuming that no such equations occur and that all inequalities are of
the form S^m(y) < t, or all are of the form t < S^m(y), drop all conjuncts
involving S^m(y); (6) lastly, if no equation involving S^m(y) occurs but
inequalities of both types do appear, then for every pair t₁ < S^m(y) and
S^m(y) < t₂, add a conjunct S(t₁) < t₂, and later drop all conjuncts
involving S^m(y). It is clear that steps (1)–(6) transform A₂ into an
equivalent formula B ∈ Φ which does not contain y.
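
The claim that steps (1)–(6) preserve truth in 𝔑₁ can be spot-checked
numerically. The sketch below is an illustration, not part of the proof: it
assumes that the y-free terms have already been given numerical values, that
the conjuncts not involving y are ignored, and that the function names
(eliminated, brute_force) are hypothetical. One function evaluates the
formula produced by the steps, the other searches for a witness y directly.

```python
from itertools import product

def eliminated(m, eqs, uppers, lowers):
    """Truth value of the y-free formula produced by steps (1)-(6).

    The conjunction treated is:  S^m(y) = e  for e in eqs,  S^m(y) < u  for
    u in uppers,  l < S^m(y)  for l in lowers, where e, u, l stand for the
    values of terms not containing y, and m >= 1.
    """
    # Step (2) adds S^(m-1)(0) < e; step (3) adds S^m(0) < u.
    if any(not (m - 1 < e) for e in eqs) or any(not (m < u) for u in uppers):
        return False
    if eqs:
        # Step (4): replace S^m(y) everywhere by the term of one equation.
        e0 = eqs[0]
        return (all(e0 == e for e in eqs)
                and all(e0 < u for u in uppers)
                and all(l < e0 for l in lowers))
    # Steps (5)-(6): only pairs  l < S^m(y) < u  leave a conjunct S(l) < u.
    return all(l + 1 < u for l in lowers for u in uppers)

def brute_force(m, eqs, uppers, lowers):
    """Look for a witness y in omega directly; the search bound is sufficient."""
    bound = max(eqs + uppers + lowers + [0]) + 2
    return any(all(y + m == e for e in eqs)
               and all(y + m < u for u in uppers)
               and all(l < y + m for l in lowers)
               for y in range(bound))
```

For instance, with m = 1 the constraint 2 < S(y) < 4 has the witness y = 2,
and the eliminated form checks S(0) < 4 together with S(2) < 4.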

Let A be any formula of L₁. Since ∀x F ≡ ¬∃x ¬F, we may assume
that A contains just existential quantifiers, say n in number. Let ∃y D be
an innermost occurrence of ∃, i.e. D is open. Transform D into a
disjunction of conjunctions as explained before. Then ∃y D = A₁ where A₁
has the form treated above. Distribute ∃y over the disjunctions, and treat
each disjunct by steps (1)–(6). By these transformations ∃y D is replaced in
A by an equivalent open formula, and A is transformed into an equivalent
formula with n − 1 quantifiers. Repeating this process n times, A will be
effectively transformed into a B ∈ Φ.

Finally, if A was a sentence, then the transformed formula is a sentence,
hence a propositional combination of formulas S^m(0) = S^k(0) or
S^m(0) < S^k(0). The truth or falsehood of such a sentence can be directly
ascertained. Thus we have a decision procedure for Th(𝔑₁) and hence for
Th(𝔑). □


Let us observe that we could have avoided the passage from 𝔑 to 𝔑₁, and
this because the relation y = S^n(x) is definable by an appropriate formula
D_n(x, y) in 𝔑. This enables us to translate all basic formulas t₁ < t₂ and
t₁ = t₂ into formulas of 𝔑, and thus get a reduction class of formulas of 𝔑.

Note that if we carry out the quantifier-elimination procedure in 𝔑 then
we actually do not get rid of all quantifiers. Rather, we "hide" them in the
formulas D_n(x, y). But this still yields the desired results. In general, the
elimination of quantifiers method should be construed in this broader
sense.

The same decision procedure applies to Th(DO). The only modification
required is that the various statements concerning equivalence of formulas
which were mathematically verified for 𝔑₁ must now be derived as formal
consequences of the axioms DO. Following this route, we not only decide
Th(DO) but also show that Th(DO) = Th(𝔑), i.e. Th(DO) is complete.


1.3. Presburger’s arithmetic 

The theory PAR was decided by PRESBURGER [1929] using the method of
elimination of quantifiers. By an appropriate formula x <_n y we can
express, for each fixed n, the relation x < y ∧ x ≡ y (mod n). Thus, for
example, x <₃ y is ∃z [¬ z = 0 ∧ x + z + z + z = y]. Enrich the structure
(ω, +) by making 0, 1 into distinguished elements. The terms of the
language are now 0, 1, x, y, ..., and all expressions which are sums of these,
e.g. x + z + z + 1 + 1 + 1, abbreviated by x + 2z + 3.
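
As a small check on the defining formula, the sketch below compares the
existential definition with the intended relation on an initial segment of
ω. The generalisation from n = 3 to arbitrary n (a sum of n copies of z) is
an assumption suggested by the example, and the function names are
illustrative only.

```python
def less_n_formula(n, x, y):
    """x <_n y read as: exists z [not z = 0 and x + z + ... + z = y], n copies of z."""
    # A witness z can never exceed y, so searching 1..y is exhaustive.
    return any(x + sum(z for _ in range(n)) == y for z in range(1, y + 1))

def less_n_direct(n, x, y):
    """The intended meaning: x < y and x is congruent to y modulo n."""
    return x < y and (y - x) % n == 0
```

Only addition is used inside the formula, in keeping with the language of
(ω, +); the multiplication by n is an abbreviation for a repeated sum.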

The reduction set X will consist of all formulas obtained from the basic
formulas t₁ = t₂, t₁ <_n t₂ (t₁, t₂ are terms, n is an integer) by
conjunctions and disjunctions. The proof, while not trivial, is not too hard
and follows the lines of the proof in Section 1.2.


1.4. Theory of real numbers

This is perhaps the best known application of the elimination of
quantifiers method. We consider the field of real numbers ℝ =
(R, 0, 1, +, ·, ≤) as an ordered field. Instead of giving Tarski's original
decision procedure we shall outline the algorithm of COHEN [1969] using
the formulation in Monk's thesis.

We introduce, on a provisional basis, certain algebraic-like functions.
Let P(x₁, ..., x_n) ∈ ℤ[x₁, ..., x_n] be a polynomial with integral
coefficients, d_n(P) be its degree in x_n, and let η ∈ ℝ^{n-1}. Define
P_η(x_n) = P(η₁, ..., η_{n-1}, x_n). For 1 ≤ i ≤ d_n(P) define f_{P,i}(η) to
equal 0 if P_η ≡ 0 or P_η has no real roots, otherwise the i-th real root of
P_η = 0 if there are at least i such roots, otherwise the largest real root.
This makes f_{P,i}(x₁, ..., x_{n-1}) a term which denotes the function
f_{P,i} : ℝ^{n-1} → ℝ. Call such terms algebraic functions.

A polynomial relation is a Boolean combination of atomic formulas of
the form 0 ≤ P, where P ∈ ℤ[x₁, ..., x_n]. An algebraic relation is a
Boolean combination of polynomial relations and formulas of the forms
0 ≤ P(x₁, ..., x_{n-1}, f₁(x₁, ..., x_{n-1})), or f₁ ≤ f₂, where
P ∈ ℤ[x₁, ..., x_n] and f₁, f₂ are algebraic functions. With each algebraic
relation a rank is associated in such a way that the rank of a polynomial
relation is 0.

The procedure for the elimination of quantifiers will reduce any formula
of the original language to an open formula (i.e. polynomial relation),
passing temporarily through the more general algebraic relations.

The proof involves two lemmas. The first lemma asserts that a formula
∃x_n A, where A is a polynomial relation, is equivalent to an algebraic
relation B. The second lemma states that an algebraic relation B of rank
1 ≤ k is equivalent to an algebraic relation of rank at most k − 1. This
lemma implies by induction that every algebraic relation is equivalent to a
polynomial relation. Taken together, these facts ensure that every formula
is equivalent to an open formula.

The proofs of the lemmas make use of Rolle's theorem to the effect that
between every two consecutive roots of p(x) = 0 there lies a root of
p′(x) = 0. If p(x) is a polynomial, then the location of the roots of
p(x) = 0 can be determined, with sufficient accuracy for our purposes, from
the location of roots of p′(x) = 0 and the values p(−∞), p(+∞). Denote, for
P ∈ ℤ[x₁, ..., x_n], P′ = ∂P/∂x_n. It turns out, roughly speaking, that
statements involving f_{P,i}, i.e. statements about the i-th root of P = 0,
can be transformed into statements involving the terms f_{P′,j}. Within the
framework of our notion of rank, this entails rank reduction which is the
key point in the proof of the second lemma.
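
The role of Rolle's theorem can be illustrated numerically. The sketch below
is not the symbolic procedure of the lemmas: it works in floating point,
assumes that the polynomial has only simple real roots, and uses
illustrative names (poly_eval, real_roots, algebraic_function). It locates
the real roots of p between consecutive roots of p′, and then evaluates
f_{P,i}(η) from the coefficient list of P_η according to the definition
given earlier in this section.

```python
def poly_eval(c, x):
    """Evaluate c[0] + c[1]*x + ... + c[n]*x^n by Horner's rule."""
    result = 0.0
    for a in reversed(c):
        result = result * x + a
    return result

def derivative(c):
    return [k * c[k] for k in range(1, len(c))]

def real_roots(c):
    """Real roots of a polynomial, in increasing order (simple roots assumed)."""
    c = list(c)
    while c and c[-1] == 0:          # drop vanishing leading coefficients
        c.pop()
    if len(c) <= 1:
        return []                    # constant polynomial
    if len(c) == 2:
        return [-c[0] / c[1]]
    # Cauchy's bound: every real root lies strictly inside (-B, B).
    B = 1 + max(abs(a) for a in c[:-1]) / abs(c[-1])
    critical = [x for x in real_roots(derivative(c)) if -B < x < B]
    points = [-B] + critical + [B]
    roots = []
    for lo, hi in zip(points, points[1:]):
        # By Rolle's theorem p is monotone here, so there is at most one root.
        flo, fhi = poly_eval(c, lo), poly_eval(c, hi)
        if flo * fhi >= 0:
            continue                 # no sign change (multiple roots are ignored)
        for _ in range(100):         # bisection
            mid = (lo + hi) / 2
            if (poly_eval(c, mid) > 0) == (fhi > 0):
                hi = mid
            else:
                lo = mid
        roots.append((lo + hi) / 2)
    return roots

def algebraic_function(p_eta, i):
    """Value of f_{P,i}(eta), given the coefficients of P_eta in x_n."""
    if all(a == 0 for a in p_eta):
        return 0.0
    roots = real_roots(p_eta)
    if not roots:
        return 0.0
    return roots[i - 1] if i <= len(roots) else roots[-1]
```

For p(x) = (x − 1)(x − 2)(x − 3) the two roots of p′ separate the three
roots of p, which is exactly the information the recursion exploits.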


1.5. Other theories 

Let us briefly mention some additional theories which have been shown
decidable by the method of elimination of quantifiers.

Let ℚ be the rational numbers, η = (ℚ, ≤) their order-type. Th(η) is
decidable.

Let ALC be the class of algebraically closed fields, then Th(ALC) is 
decidable. Here every formula is equivalent to an open formula, a fact that 
can be established, for example, by employing the classical algebraic 
elimination theory. 

If BA is the class of all Boolean algebras, then Th(BA) is decidable. This 
result is due to TARSKI [1949] and is somewhat difficult. 


2. Model theoretic methods 


2.1. Categoricity, completeness and decidability 

A theory T is called categorical in cardinality α if all models 𝔄 ⊨ T of
cardinality c(𝔄) = α are pairwise isomorphic. By the cardinality of 𝔄 we
mean the cardinality of the domain of 𝔄. The following simple observation
is due to R. Vaught. It is assumed that T is countable.


THEOREM 4. If the theory T has no finite models and is categorical in some
(necessarily infinite) cardinality α then T is complete.


PROOF. Assume, by way of contradiction, that T is not complete. Then
there exists a sentence S in the language of T such that T₁ = T ∪ {S} and
T₂ = T ∪ {¬S} are consistent. Hence there are countable (i.e. finite or
denumerable) models 𝔄₁ ⊨ T₁, 𝔄₂ ⊨ T₂. Since T has no finite models,
c(𝔄₁) = c(𝔄₂) = ω. If α = ω, then 𝔄₁ ≅ 𝔄₂, but 𝔄₁ ⊨ S and 𝔄₂ ⊨ ¬S, a
contradiction. Otherwise there exist, by the Skolem–Tarski–Vaught
theorem, elementary extensions 𝔄₁ ≺ 𝔅₁, 𝔄₂ ≺ 𝔅₂ so that c(𝔅₁) = c(𝔅₂) =
α. Again 𝔅₁ ≅ 𝔅₂, 𝔅₁ ⊨ S and 𝔅₂ ⊨ ¬S. □


The stipulation that T has no finite models is essential. Consider the
theory E of just equality =. E is categorical in every cardinality. Yet
E ∪ {∀x ∀y [x = y]}, as well as E ∪ {∃x ∃y [¬ x = y]}, are consistent.

Perhaps the simplest application is proving that Th(η) is decidable.
Consider the axioms DNO consisting of the axioms for total-order together
with

∀x ∀y ∃z [x < y → x < z < y]


∀x ∃y ∃z [z < x < y].


Every model (A, ≤) ⊨ DNO is a totally and densely ordered set without
a first or last element, and is therefore infinite. By Cantor's
characterization of the order type η = (ℚ, ≤), if c(A) = ω then
η ≅ (A, ≤). Consequently DNO is categorical in cardinality ω, so that
Th(DNO) is complete by Theorem 4 and hence, by Theorem 1, decidable. Now
η ⊨ DNO so that Th(DNO) ⊆ Th(η), but Th(DNO) is complete so that
Th(DNO) = Th(η). □


2.2. Algebraically closed fields 

Next we consider the theory ALC of algebraically closed fields. This
theory can be axiomatized in a language L having symbols 0, 1, +, ·, by
writing the usual field axioms and adding a sequence of axioms A_n,
n = 1, 2, ...,


A_n = ∀y₀ ··· ∀y_{n-1} ∃x [y₀ + y₁x + ··· + y_{n-1}x^{n-1} + x^n = 0].


The axiom A_n is written using some obvious abbreviations.

The axioms ALC are not complete because the characteristic of the field
has not been specified. This can be done by adding, for a prime p, an axiom
C_p ≡ p·1 = 0, where the left-hand side abbreviates a sum of terms 1. To
obtain axioms for characteristic 0 put C₀ = {¬C₂, ¬C₃, ...}.

We shall now show that T_p = ALC ∪ {C_p} is complete for any prime p,
and T₀ = ALC ∪ C₀ is also complete. Let F ⊨ T_p, where p is a prime or
p = 0, be an algebraically-closed field of characteristic p and cardinality
ω ≤ c(F) = α. Let P ⊆ F be the prime-field in F; then P = ℤ/pℤ for p ≠ 0,
and P = ℚ if p = 0. By Steinitz's structure theorem for algebraically-closed
fields, there exists a set X ⊆ F of elements algebraically independent over
P so that F ⊇ P(X) ⊇ P is the algebraic closure of P(X). The isomorphism
type of F depends just on P (i.e. p) and c(X), the so-called degree
of transcendence of F. Now if ω < c(F) then c(X) = c(F) = α. Hence for
all p, T_p is categorical in every non-countable cardinality ω < α.

Since an algebraically-closed field is infinite, T_p has no finite models,
so T_p is complete by Theorem 4. By use of Theorem 1, this implies that for
p = 0 and for each prime p, the theory T_p of algebraically-closed fields of
characteristic p is decidable.

Since ALC ∪ C₀, ALC ∪ {C_p}, p prime, gives an enumeration of all
completions of ALC, it follows from Theorem 2 that Th(ALC), the theory
of algebraically-closed fields, is decidable.


2.3. Real-closed fields 

Tarski’s result concerning the decidability of the theory of the field of 
real numbers can also be achieved by model theoretic methods. We first 
need a set of axioms for the field of real numbers. These were provided by 
Artin and Schreier in their famous study of real-closed fields. 

Consider a language L₁ which, like L of Section 2.2, has 0, 1, +, ·, but in
addition has an order symbol ≤. The axioms RLC consist of the
following: the field axioms, axioms stating that ≤ is a total order, and
furthermore

∀x ∀y ∀z [x ≤ y ∧ 0 ≤ z → xz ≤ yz],


∀x ∀y ∀z [x ≤ y → x + z ≤ y + z],
∀x ∃y [0 ≤ x → y² = x],
A_n for n = 1, 3, 5, ....


Here, as in 2.2, A_n is the statement that the n-th degree polynomial
equation has a root.

The field of real numbers is a model of RLC but by no means the only
model. Any ordered field (F, 0, 1, +, ·, ≤) ⊨ RLC will be called (ordered)
real-closed. Th(RLC) is not categorical in any power, so that Vaught's test
cannot be used. Completeness, and consequently decidability, are proved
by means of Robinson's concept of model completeness.

A theory T is model complete if for any two models 𝔄 ⊨ T, 𝔅 ⊨ T such
that 𝔄 ⊆ 𝔅, we have 𝔄 ≺ 𝔅 (𝔅 is an elementary extension of 𝔄).
ROBINSON [1956] gave a test for model completeness. From this test the
model completeness of RLC can be deduced. The proof, while not very
hard, does require some effort. Alternatively one can use the following
test.

T is model complete if for any two models 𝔄 ⊆ 𝔅 of T there exists an
elementary extension 𝔄 ≺ ℭ such that 𝔅 ⊆ ℭ. Combining this with
algebraic properties of real-closed fields and with the method of
ultraproducts we can get a somewhat different proof for the model
completeness of Th(RLC).

Now a model complete theory T need not be complete. However, if T is 
model complete and has a prime model P which is, up to isomorphism, 
included as a submodel in every model of T, then T is complete. It is 
readily seen that under these conditions every two models of T are 
elementarily equivalent. 

The theory Th(RLC) has a prime model. Namely, every real-closed field
F is of characteristic 0 and hence contains the field ℚ of rational numbers.
It therefore also contains an isomorphic copy of the field of real-algebraic
numbers and this is the common prime model. Consequently Th(RLC) is
complete and decidable. The field ℝ of real numbers satisfies ℝ ⊨ RLC,
hence Th(ℝ) = Th(RLC) and is decidable.

It should be remarked that this approach to the decidability of the field 
of real numbers is, despite the difference in methods, not too different from 
the classical elimination of quantifiers method. At the bottom of the proof 
of model completeness lies the fact that if two ordered fields F₁, F₂ are
isomorphic (the mapping preserves also the order) then their real-closures 
are isomorphic. This is proved by using information concerning the 
location of roots of equations. The analysis involved is not too different 
from the examination of the location of roots in the elimination of 
quantifiers method. On the other hand, one can claim that the uniqueness 
of the real-closure is a basic algebraic result established in its own right. In
the model-theoretic proof of the decidability of Th(RLC) we are thus 
quoting a standard result, and from this point of view the proof is more 
elementary. 


2.4. Theory of DO revisited 

By way of illustrating how model-theoretic methods are useful for 
establishing decidability even in the absence of categoricity, let us re- 
examine Th(DO). 

In 1.1, Example 2, we observed that every model of DO has the
order-type ω + (ω* + ω)λ. We shall show that every countable model
𝔄 = (A, ≤) ⊨ DO has an elementary extension 𝔄 ≺ 𝔅 for which the λ is η.
This will imply Th(𝔄) = Th(𝔅), hence every two countable models of DO
are elementarily equivalent, and hence Th(DO) is complete and decidable.

Define an equivalence relation E on any model 𝔄 ⊨ DO: xEy ≡
c({z | x < z < y ∨ y < z < x}) < ω, i.e. the number of elements between x
and y is finite. We see that the equivalence classes with respect to E are the
"blocks" ω, and each ω* + ω, in the order type ω + (ω* + ω)λ of 𝔄.
Consider any two blocks (equivalence classes) B₁, B₂ of 𝔄; without loss of
generality let x < y for all x ∈ B₁, y ∈ B₂. It is consistent with all the
elementary statements about 𝔄, i.e. with the complete diagram CD(𝔄) of
𝔄, to assume the existence of an element c such that x < c < y for all
x ∈ B₁, y ∈ B₂. Thus CD(𝔄) and all these inequalities have a countable
model 𝔄₁, which is an elementary extension 𝔄 ≺ 𝔄₁ of 𝔄. In this 𝔄₁ the
block B of c lies between B₁ and B₂. Similarly we can construct an
extension with a block above B₂. Because the extension 𝔄 ≺ 𝔄₁ is
countable, it is possible to construct a tower of elementary extensions
𝔄 ≺ 𝔄₁ ≺ ···, so that each 𝔄_n is countable and for each pair B₁, B₂ of
blocks of each 𝔄_n there exists an n < k and a block B of 𝔄_k situated
between B₁ and B₂, and similarly for a block above B₂. Let 𝔅 = ∪_{n<ω} 𝔄_n;
then 𝔄 ≺ 𝔅, 𝔅 is countable, and the blocks of 𝔅 are densely ordered. Thus
the order type of 𝔅 is ω + (ω* + ω)η, which completes the proof.


2.5. Further results 

Many additional important results were obtained by model-theoretic
methods, sometimes in combination with the method of elimination of 
quantifiers. Let us mention without proofs a few outstanding theorems. 
The proofs usually involve deep mathematical results concerning the 
structures in question, making for an interesting combination of standard 
mathematics and logic. 


THEOREM 5 (AX and KOCHEN [1965a, 1965b, 1966]). The theory of p-adic
fields is decidable. 


This result laid to rest a long-standing conjecture that the only decidable 
fields are the real-closed and the algebraically closed fields. 


THEOREM 6 (AX [1968]). The theory of the class of all finite fields is
decidable. 


THEOREM 7 (SZMIELEW [1954]). The theory of commutative groups is
decidable.


THEOREM 8 (EHRENFEUCHT [1959]). The theory of linearly (totally) ordered
sets is decidable.


This result was announced in an abstract. The first full proof to be
published is due to LÄUCHLI and LEONARD [1966]. As will be seen in the
next chapter, this result is an easy consequence of the method in RABIN
[1969].

Finally we shall quote a special case of more general results relating 
models to decidability. 

Let 𝔄 = (A, f₀, ..., f_i, ...)_{i<α} and 𝔅 = (B, g₀, ..., g_i, ...)_{i<α},
α ≤ ω, be similar algebras of the same type, i.e. f_i and g_i are n_i-ary
operations on A and B respectively. The direct product 𝔄 × 𝔅 is defined in
the obvious manner.


THEOREM 9. If Th(𝔄) and Th(𝔅) are decidable so is Th(𝔄 × 𝔅).


This is but a special case of a general study of the first-order properties of 
products of structures and classes of structures initiated by MOSTOWSKI
[1952] and developed by FEFERMAN and VAUGHT [1959]. We have restricted 
ourselves to a very special case in order to avoid the elaborate definitions 
appearing in the general theory. 

The method of products is a powerful tool for obtaining new decidability
results from known ones. The following example is due to Mostowski. We
know that Th((ω, +)) is decidable, see Section 1.3. From the theory of
general products it follows that the direct sum 𝔓 of ω copies of (ω, +) has a
decidable theory. The domain of 𝔓 consists of all ω-length vectors


v = (n₀, ..., n_k, 0, 0, ...), n_i ∈ ω, k = 0, 1, ...,


and the operation is component-wise addition. Define a mapping φ(v) =
p₀^{n₀} ··· p_k^{n_k}, where p_i is the (i + 1)-th prime. The mapping φ is an
isomorphism φ : 𝔓 → (ω − {0}, ·) = 𝔐 onto the multiplicative semi-group of
positive integers. Hence Th(𝔐) is decidable, a result due to Skolem.
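
The isomorphism can be made concrete on small examples. In the sketch below
a vector with finite support is represented by a Python tuple with the
trailing zeros omitted; this representation and the function names (primes,
phi, phi_inverse, add) are assumptions of the illustration.

```python
def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for a small demo)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1

def phi(v):
    """Map (n_0, ..., n_k) to p_0^n_0 * ... * p_k^n_k."""
    result = 1
    for p, n in zip(primes(), v):
        result *= p ** n
    return result

def phi_inverse(m):
    """Exponent vector of a positive integer m, trailing zeros trimmed."""
    v = []
    for p in primes():
        if m == 1:
            return tuple(v)
        e = 0
        while m % p == 0:
            m //= p
            e += 1
        v.append(e)

def add(u, v):
    """Component-wise addition, padding the shorter vector with zeros."""
    k = max(len(u), len(v))
    u = u + (0,) * (k - len(u))
    v = v + (0,) * (k - len(v))
    return tuple(a + b for a, b in zip(u, v))
```

Unique factorization is what makes φ one-to-one and onto, and the law of
exponents is what turns component-wise addition into multiplication.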


3. The method of interpretations 


3.1. Semantic interpretations 

We shall start by outlining what is meant by obtaining a structure
𝔄 = (A, R) from a structure 𝔅 = (B, S₁, S₂, ...) by means of definable
relations. We need some preliminary notions. Let L be the language of
𝔄 and L₁ be the language of 𝔅. Let D(x, y, ...) be a formula of L₁ with x as
a free variable but possibly containing other free variables y, ..., which
will play the role of parameters. Abbreviate D(x, y, ...) by D(x) or even D.

If F is a formula of L₁, then the formula F^D obtained from F by
relativizing all quantifiers of F to D is defined inductively on the structure
of F by the following rules. If F is quantifier-free then F^D = F. If
F = E ∨ G or F = ¬E then F^D = E^D ∨ G^D or F^D = ¬E^D, respectively.
The crucial cases are F = ∃u G and F = ∀u G:


(∃u G)^D = ∃u [D(u) ∧ G^D], (∀u G)^D = ∀u [D(u) → G^D].


Here D(u) means D(u, y, ...), i.e. u substituted for x in D. Note that in
order to correctly effect the relativization, we must sometimes
alphabetically change the names of certain variables in F in order to avoid
binding a variable other than x, which is free in D.

Let b ∈ B, ..., be a sequence of values in B for the parameters y, ..., of
D. Define C = {a | 𝔅 ⊨ D(a, b, ...)}. This is the domain defined by D and
the specialization y = b, ... of the parameters. The subset C ⊆ B induces a
substructure ℭ = (C, S₁|C, S₂|C, ...) of 𝔅.


LEMMA. Let F(z₁, ..., z_n) be a formula of L₁, and let D, 𝔅, b ∈ B, ..., and
ℭ be as above; then for c₁, ..., c_n ∈ C


𝔅 ⊨ F^D(c₁, ..., c_n) iff ℭ ⊨ F(c₁, ..., c_n).


Thus the effect of relativization is to convert satisfaction of the formula F
in 𝔅 to satisfaction in the substructure ℭ.
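
On a finite structure both sides of the lemma can be computed, which gives a
convenient check of the relativization rules. The sketch below is an
illustration under assumptions: formulas are nested tuples, D has no
parameters and is supplied as a Python function building D(u), variable
renaming is not handled, and all names (relativize, holds, the relations
'lt' and 'even') are hypothetical.

```python
def relativize(f, D):
    """Return F^D; D(u) must build the formula D with u substituted for x."""
    op = f[0]
    if op == 'atom':
        return f
    if op == 'not':
        return ('not', relativize(f[1], D))
    if op in ('or', 'and', 'implies'):
        return (op, relativize(f[1], D), relativize(f[2], D))
    if op == 'exists':
        return ('exists', f[1], ('and', D(f[1]), relativize(f[2], D)))
    if op == 'forall':
        return ('forall', f[1], ('implies', D(f[1]), relativize(f[2], D)))
    raise ValueError(f"unknown connective: {op}")

def holds(f, domain, rels, env):
    """Satisfaction in the finite structure (domain, rels) under env."""
    op = f[0]
    if op == 'atom':                       # ('atom', relation name, variables...)
        return tuple(env[v] for v in f[2:]) in rels[f[1]]
    if op == 'not':
        return not holds(f[1], domain, rels, env)
    if op in ('or', 'and', 'implies'):
        a = holds(f[1], domain, rels, env)
        b = holds(f[2], domain, rels, env)
        return {'or': a or b, 'and': a and b, 'implies': (not a) or b}[op]
    test = any if op == 'exists' else all
    return test(holds(f[2], domain, rels, {**env, f[1]: a}) for a in domain)

# The structure B: the numbers 0..5 with the order and a predicate for evenness.
B = range(6)
rels = {'lt': {(a, b) for a in B for b in B if a < b},
        'even': {(a,) for a in B if a % 2 == 0}}

def D(u):
    return ('atom', 'even', u)

# The domain C defined by D, and the relations of the induced substructure.
C = [a for a in B if holds(D('x'), B, rels, {'x': a})]
rels_C = {name: {t for t in r if all(a in C for a in t)}
          for name, r in rels.items()}

# F(z): "z has an immediate successor".
F = ('exists', 'u', ('and', ('atom', 'lt', 'z', 'u'),
     ('not', ('exists', 'w', ('and', ('atom', 'lt', 'z', 'w'),
                                     ('atom', 'lt', 'w', 'u'))))))
```

In this example 4 has an immediate successor in the full structure but not
in the substructure of even numbers, and F^D evaluated in the full structure
agrees with the substructure, as the lemma predicts.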

A somewhat more complex construction is the following. For the sake of 
the simplicity of the notation, let us restrict ourselves to a formula D(x) of 
L₁ containing just the free variable x (and no parameters) and a formula
E(u, v) with two free variables. 


DEFINITION 3. The structure 𝔅(D, E) = (C, R) induced in 𝔅 by D(x) and
E(u, v) has, by definition, the domain C = {c | 𝔅 ⊨ D(c)} and the binary
relation R ⊆ C × C, R = {(b, c) | b, c ∈ C, 𝔅 ⊨ E(b, c)}.

Let now F(z₁, ..., z_n) be a formula in a language with a binary predicate
symbol P and define F^{D,E} to be the formula obtained from F by first
forming F^D and then replacing in F^D all atomic formulas P(z₁, z₂) by
E(z₁, z₂). Note that the quantifiers in E(u, v) are not being relativized to
D. The following lemma is actually a corollary of the previous lemma. It
relates satisfaction in the induced structure to satisfaction in 𝔅.


LEMMA. For c₁, ..., c_n ∈ C,
𝔅(D, E) ⊨ F(c₁, ..., c_n) iff 𝔅 ⊨ F^{D,E}(c₁, ..., c_n).


The application to decidability rests on the following theorem taken
from RABIN [1965].

Let T and T₁ be theories in the languages L and L₁, respectively, and let
𝒦 and 𝒦₁ be classes of structures such that T = Th(𝒦), T₁ = Th(𝒦₁).
Assume that L has the predicate symbols P₀, ..., P_k and no operation
symbols.


THEOREM 10. Let D(x, y, ...) be a formula of L₁, and E = (E₀, ..., E_k) be a
sequence of formulas so that if P_i is n_i-ary then E_i has n_i free
variables, 0 ≤ i ≤ k.

Assume that (1) For all 𝔅 ∈ 𝒦₁ and all values y = b, ... of the parameters
of D, 𝔅(D, E) ⊨ T. (2) For every 𝔄 = (A, R₀, ..., R_k) ∈ 𝒦, there exists a
model 𝔅 ∈ 𝒦₁ and a specialization y = b ∈ B, ... such that for this
specialization 𝔄 ≅ 𝔅(D, E).

Under these conditions, if T₁ is decidable then so is T. Conversely, if T is
undecidable then so is T₁.


PROOF. Let F be a sentence of L; put G = ∀y ··· F^{D,E} where the universal
quantification is over all the parameter-variables in D(x, y, ...) (these
variables are free in F^{D,E} if F did contain quantifiers). By the second
lemma, for any 𝔅 ∈ 𝒦₁ and specialization y = b ∈ B, ...


(*) 𝔅 ⊨ F^{D,E}(b, ...) iff 𝔅(D, E) ⊨ F.


Let now F ∈ T. Condition (1) implies 𝔅(D, E) ⊨ F for any 𝔅 ∈ 𝒦₁,
y = b, .... Hence the left side of (*) holds, hence 𝔅 ⊨ G. But 𝔅 ∈ 𝒦₁ was
arbitrary, hence G ∈ Th(𝒦₁) = T₁.

Next assume G ∈ T₁. Let 𝔄 ∈ 𝒦; then, by (2), for some 𝔅 ∈ 𝒦₁ and
y = b, ..., 𝔄 ≅ 𝔅(D, E). We have 𝔅 ⊨ G, hence 𝔅 ⊨ F^{D,E}(b, ...).
Therefore, by (*), 𝔅(D, E) ⊨ F, hence 𝔄 ⊨ F; consequently 𝒦 ⊨ F and F ∈ T.

Let T₁ be decidable and F be any sentence of L. Form G; since G ∈ T₁
iff F ∈ T, we can determine whether F ∈ T. □


Remark. It is readily seen that if T is finitely axiomatizable then condition 
(1) can be dispensed with by modifying the construction of G. 


We have stated Theorem 10 for first-order languages. With appropriate
changes it also holds if L₁, or even both L and L₁, are second-order
languages. The only case of second-order languages which is of any interest
from the point of view of decidability is that of monadic second-order
languages which have variables ranging over subsets of the domain but no
variables ranging over relations on the domain. This is because once we
have a variable ranging over, say, binary relations, the theory of all true
sentences of the second-order language is undecidable.

Assume that L₁ has set variables A, B, .... Then the relativizing formula
D(x) may be of the form x ∈ A and A will be a parameter in F^{D,E}. If L has
set variables then they must also be relativized to D by rules such as


(∀A F)^D = ∀A [∀x [x ∈ A → D(x)] → F^D].


With these natural modifications, Theorem 10 holds for monadic
second-order languages.


3.2. Decidable second-order theories 

Let us consider monadic second-order languages L₂ which have set
variables A, B, ..., and the ∈ relation. To be appropriate for a structure
𝔄 = (A, R) where R is, say, a binary relation, L₂ must also have a binary
predicate symbol P. For such a language L₂ the (monadic) second-order
theory Th₂(𝔄) of 𝔄 is the set of all sentences of L₂ true in 𝔄. Similarly we
define Th₂(𝒦) for a class 𝒦 of similar structures by
Th₂(𝒦) = ∩_{𝔄∈𝒦} Th₂(𝔄).

The first significant results of second-order decidability, beyond the
decidability of just pure monadic second-order logic, deal with the
decidability of Th₂((ω, S)) where S(x) = x + 1 is the successor function.


THEOREM 11 (BÜCHI [1962]). Th₂((ω, S)) is decidable.


This result was actually preceded by a weaker version. Consider a weak
monadic second-order language LW which has set variables α, β, ... which
are restricted to range over finite subsets of the domain. The theory of a
structure 𝔄 in the language LW will be called the weak (monadic)
second-order theory of 𝔄 and denoted by Th_w(𝔄).

BÜCHI [1960] and ELGOT [1961] have proved that Th_w((ω, S)) is
decidable.

For future reference, denote Th₂((ω, S)) = S1S (the second-order theory of
one successor function) and Th_w((ω, S)) = WS1S.

We cannot enter into details of these decidability proofs. Let us just say
that they utilize methods and results of automata theory. The case of WS1S
employs concepts and results from the theory of automata operating on
finite sequences and is very simple and transparent. The proof of
decidability of S1S required a new notion of an automaton operating on an
infinite ω-sequence.


3.3. The tree theorem 

Most of the proofs of decidability by interpretations involve the Tree
Theorem due to RABIN [1969].

Let T = {0, 1}* be the set of all finite words (sequences) x = x₁x₂ ··· x_n,
x_i ∈ {0, 1}, on the alphabet {0, 1}. The empty sequence Λ is also in {0, 1}*.
The set T can also be interpreted as the infinite binary tree (see Fig. 1).
Arbitrarily assigning 0 to left and 1 to right, the correspondence between
nodes of the tree and T is as follows. The root corresponds to Λ; the right
successor (son) corresponds to 1, and the left successor to 0; the left
successor of 1 is 10, etc.


Fig. 1. 


Thus we have on T the two successor functions r₀(x) = x0, r₁(x) = x1,
x ∈ T. Let L₂ be an appropriate monadic second-order language having
operation symbols r₀, r₁. The set of all sentences of L₂ true in (T, r₀, r₁)
will be denoted, as usual, by Th₂((T, r₀, r₁)).


THEOREM 12 (Tree Decidability Theorem). The second-order theory of two
successor functions S2S = Th₂((T, r₀, r₁)) is decidable.


The proof of this theorem requires a far-reaching extension of the theory 
of automata to cover the case of a finite automaton operating on an infinite 
tree. One interesting feature of the proof is that even though we want to 
establish certain facts concerning finite objects, namely the finite automata, 
transfinite induction over ordinals up to the first uncountable cardinal is
used in an essential way.


3.4. Presburger's arithmetic revisited

Büchi and Elgot used the decidability of WS1S to prove the decidability
of PAR.

The key idea is to use finite sets α ⊆ ω to code integers. Let χ_α be the
characteristic function of α. Define


n(α) = 1·χ_α(0) + 2·χ_α(1) + ··· + 2^x·χ_α(x) + ···.


We shall construct a formula A(α, β, γ) in the language of WS1S which
will be true in (ω, S) for α, β, γ ⊆ ω if and only if n(α) + n(β) = n(γ). This
is done by considering the set δ of carries in the addition of n(β) to
n(α) as binary numbers.


A(α, β, γ) ≡ ∃δ ∀x [¬ 0 ∈ δ ∧
    [S(x) ∈ δ ↔ x ∈ α ∧ x ∈ β ∨ x ∈ α ∧ x ∈ δ ∨ x ∈ β ∧ x ∈ δ] ∧
    [x ∈ γ ↔ x ∈ α ∧ x ∈ β ∧ x ∈ δ ∨
             x ∈ α ∧ ¬ x ∈ β ∧ ¬ x ∈ δ ∨ ···]].

By systematic use of A(α, β, γ) to replace a + b = c in a sentence F of
PAR, a sentence G in the language of WS1S is obtained. We have
F ∈ Th((ω, +)) iff G ∈ WS1S so that we can decide whether F is true.
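
The coding of addition by carry sets can be checked on small finite sets.
The sketch below builds γ and δ position by position in the way the formula
A(α, β, γ) prescribes. It assumes that the disjuncts elided in the formula
are the remaining cases in which an odd number of the conditions x ∈ α,
x ∈ β, x ∈ δ hold; the function names are illustrative.

```python
def n(alpha):
    """n(alpha): the number whose binary digits are given by the finite set alpha."""
    return sum(2 ** x for x in alpha)

def add_sets(alpha, beta):
    """Return (gamma, delta): the sum set and the carry set for alpha, beta."""
    gamma, delta = set(), set()      # 0 is never put into delta: no carry into bit 0
    top = max(alpha | beta, default=-1) + 1
    for x in range(top + 1):
        a, b, d = x in alpha, x in beta, x in delta
        # S(x) in delta  iff  at least two of the three memberships hold.
        if (a and b) or (a and d) or (b and d):
            delta.add(x + 1)
        # x in gamma  iff  an odd number of the three memberships hold
        # (assumed reading of the elided disjuncts).
        if a ^ b ^ d:
            gamma.add(x)
    return gamma, delta
```

For example, α = {0, 1} and β = {0} code 3 and 1; the carries are δ = {1, 2}
and the sum is γ = {2}, which codes 4.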


3.5. Second-order theory of linear order

Let K2 be the class of all totally ordered sets (A, ≤) ⊨ OR with a
countable domain, c(A) ≤ ω.


THEOREM 13 (RABIN [1969]). Th₂(K2) is decidable.


PROOF. We can define on T the partial-order x ≤ y, x is an initial segment
of y, by a formula x ≤ y. Namely,

x ≤ y ≡ ∀A [x ∈ A ∧ ∀z [z ∈ A → r₀(z) ∈ A ∧ r₁(z) ∈ A] → y ∈ A].

Also the lexicographic order x ≼ y is definable:

x ≼ y ≡ x ≤ y ∨ ∃z [r₀(z) ≤ x ∧ r₁(z) ≤ y].


The ordered set ({x1 | x ∈ T}, ≼) has order type η. Therefore for every
countable ordered set (A, ≤) there exists a set A′ ⊆ T such that (A, ≤) ≅
(A′, ≼). Using the relativizing formula D(x, A) ≡ x ∈ A and replacing ≤
by ≼, we see that Th₂(K2) can be interpreted in S2S in the manner of
Theorem 10. Hence Th₂(K2) is decidable. □
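
The two orders on the tree, and the claim that the words ending in 1 are
densely ordered, can be examined on short words. In the sketch below words
over {0, 1} are Python strings; reading a word ending in 1 as a binary
fraction in the open unit interval is an auxiliary device of the
illustration, not part of the text, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def initial_segment(x, y):
    """The partial order of the tree: the word x is a prefix of the word y."""
    return y.startswith(x)

def lex_le(x, y):
    """Lexicographic order: x is a prefix of y, or some z has z0 below x and z1 below y."""
    if initial_segment(x, y):
        return True
    # Only common prefixes z = x[:i] of x and y can serve as witnesses.
    return any(initial_segment(x[:i] + '0', x) and initial_segment(x[:i] + '1', y)
               for i in range(min(len(x), len(y))))

def value(w):
    """Read the word w as the binary fraction 0.w (exact arithmetic)."""
    return sum(Fraction(int(bit), 2 ** (k + 1)) for k, bit in enumerate(w))

def words_ending_in_one(max_len):
    """All words x1 with x of length less than max_len."""
    return [''.join(bits) + '1'
            for k in range(max_len) for bits in product('01', repeat=k)]
```

On words ending in 1 the lexicographic order coincides with the numerical
order of the corresponding dyadic fractions, which is one way to see that
the order type is that of the rationals.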


As simple corollaries we get difficult classical results.


COROLLARY. Th(OR), the theory of linearly ordered sets, is decidable.


The Skolem–Löwenheim theorem implies that every (A, ≤) ⊨ OR is
elementarily equivalent to a 𝔅 ∈ K2. Hence Th(OR) = Th(K2). The latter
theory is, of course, decidable.

The monadic second-order language is sufficiently powerful to express
the fact that a set is well-ordered. Namely, the sentence


W ≡ ∀A ∀x ∃y ∀z [x ∈ A → y ∈ A ∧ [z ∈ A → y ≤ z]]


has the property that a linearly ordered set 𝔄 satisfies 𝔄 ⊨ W if and only if
𝔄 is well-ordered. This immediately leads to


COROLLARY. The monadic second-order theory of countable well-ordered 
sets is decidable. 


PROOF. For any sentence F we have K2 ⊨ W → F if and only if F is true in
all countable well-ordered sets. □


Every well-ordered set (B, ≤) has a countable elementary submodel
𝔄 = (A, ≤) ≺ (B, ≤). Hence the first-order theory of well-ordered sets is
the same as the first-order theory of countable well-ordered sets which is
decidable by the previous corollary. Thus the first-order theory of
well-ordered sets is decidable, a result due to Tarski and Mostowski [1949].


3.6. Decidability in topology

It is possible to define in S2S the notion of a path A ⊆ T going from the
root to infinity:


Path(A) ≡ Λ ∈ A ∧ ∀x ∀y [x ∈ A → [r₀(x) ∈ A ∨ r₁(x) ∈ A] ∧
    [y ≤ x → y ∈ A] ∧ ¬[r₀(x) ∈ A ∧ r₁(x) ∈ A]].

Consider {0, 1}° with the usual Tychonoff product topology. This is the 
well-known Cantor Discontinuum CD. For every point p : {0,1} , the 


set A C T of all finite initial segments of p, is a path of T and this is a one- 
to-one correspondence. We can also reproduce in S2S the topology of CD. 
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Let B C T be aset which is a union of paths, then the set of paths A C B is 
a closed subset of CD, and this again is a one-to-one correspondence. 
Define 


$\mathrm{CL}(B) \equiv \forall x\,[\,x \in B \to \exists A\,[\,\mathrm{Path}(A) \wedge A \subseteq B \wedge x \in A\,]\,]$.


THEOREM 14 (RABIN [1969]). The monadic second-order theory of CD, with 
the set variables restricted to range over closed sets, is decidable. 


PROOF. Let F be any sentence in the language of CD. Relativize all
individual variables to Path(X) and all (closed) set variables to CL(B), and
replace all formulas $x \in B$ of F by $X \subseteq B$. The resulting sentence E is
true in S2S if and only if CD $\models F$. □


With slight changes, accounting for the fact that two different sequences 
$p, q \in \{0,1\}^\omega$ may represent the same element of the real-line segment
[0,1], the above proof may be modified to cover the case of [0, 1]. 


3.7. Boolean algebras 
The following theorem settles the decidability of Th(BA) — the elemen- 
tary theory of Boolean algebras and much more. 


THEOREM 15 (RABIN [1969]). Let $\mathfrak{B}_\omega$ be the free Boolean algebra on $\omega$
generators and let LI be a second-order language appropriate for Boolean
algebras with set variables ranging over ideals of the algebras.
$\mathrm{Th}_I(\mathfrak{B}_\omega)$, the theory of $\mathfrak{B}_\omega$ in the language LI, is decidable.


PROOF. The set of all closed-and-open (clopen) subsets of CD is a Boolean
algebra isomorphic to $\mathfrak{B}_\omega$, hence we can identify it with $\mathfrak{B}_\omega$. If
$I \subseteq \mathfrak{B}_\omega$ is any ideal then $U(I) = \bigcup_{S \in I} S$ is an open set
$U(I) \subseteq$ CD. The sets U(I) run through all open sets of CD and the
correspondence is 1-1. Furthermore, for $S \in \mathfrak{B}_\omega$, $S \subseteq U(I)$ if and only
if $S \in I$.

Every sentence of LI can therefore be translated into a sentence about
CD by relativizing the individual variables to set variables ranging over
clopen subsets of CD, relativizing the ideal variables to variables ranging
over open sets (i.e. complements of closed sets) of CD, and replacing
$x \in I$ by $X \subseteq I$. The transformed sentence is true in CD if and only if
the original sentence belongs to $\mathrm{Th}_I(\mathfrak{B}_\omega)$, and the former question is
decidable. □


Let now $\mathfrak{B}$ be any countable Boolean algebra. Then there exists an ideal
$J \subseteq \mathfrak{B}_\omega$ so that $\mathfrak{B} \cong \mathfrak{B}_\omega / J$, and the ideals $I \subseteq \mathfrak{B}$ are in a natural
1-1 correspondence with the ideals $J \subseteq J_1 \subseteq \mathfrak{B}_\omega$. By the method of
interpretations, this will yield the following theorem. Denote by $\mathrm{BA}_\omega$
the class of all countable Boolean algebras.


THEOREM 16. $\mathrm{Th}_I(\mathrm{BA}_\omega)$, the theory in the language LI of all countable
Boolean algebras, is decidable.


If we restrict ourselves to sentences $G = \forall I_1 \cdots \forall I_n\, F(I_1, \ldots, I_n)$,
where F is a formula without any quantification over ideals, then
$G \in \mathrm{Th}_I(\mathrm{BA}_\omega)$ iff $F(I_1, \ldots, I_n)$ is true in every countable Boolean
algebra $\mathfrak{B} = (B, \cup, \cap, ', I_1, I_2, \ldots)$, where $I_1, I_2, \ldots$ are ideals of
$\mathfrak{B}$. Using the Skolem-Löwenheim theorem we immediately get:


THEOREM 17 (RABIN [1969]). The elementary theory of all Boolean algebras 
with a sequence of distinguished ideals is decidable. 


This theorem considerably strengthens the result of ERSHOV [1964] which
asserts the decidability of Boolean algebras with a distinguished prime 
ideal (ultra-filter). 


3.8. Non-classical logics 

Thus far all the results presented in this paper dealt with theories 
formalized within classical logic. Many extensions and modifications of 
classical logic appear in the literature. These may take the form of rejection 
of certain axioms of classical logic, the intuitionistic logic is an example, or 
the addition of logical operators or connectives, as is being done in modal 
or tense logics. Important philosophical considerations and attitudes 
towards the foundations of mathematics and logic motivate the introduc- 
tion and study of these systems. 

While the decidability of the classical propositional calculus is trivial, the 
decidability of these fragments or enrichments (by addition of logical 
operators) of even the propositional logic is in most cases far from obvious. 
The method of interpretations turned out to be a powerful tool for settling 
almost all the decidability questions in this field. We shall illustrate this by 
two examples. The interested reader should consult the article of GABBAY 
[1975], which is the source for these examples, and where many other 
results and references are to be found. 

The class of modal propositional logics to be considered here has, besides
the usual propositional connectives, the operator $\Box$ which is intended to
express necessity, so that $\Box A$ should be construed to mean: necessarily A.
A basic axiom system C-2, has all the theorems and rules of classical
logic, and in addition the axioms and rules of inference


$\Box[A \to B] \to [\Box A \to \Box B]$,

from $\vdash A \to B$ to infer $\vdash \Box A \to \Box B$.

The system K is obtained from C-2 by adding the rule
from $\vdash A$ to infer $\vdash \Box A$.


The system T is K plus the axioms $\Box A \to A$. And the system S4 is T plus
the axioms $\Box A \to \Box\Box A$.

There are many other extensions of C-2. The particular axioms are 
chosen by the various authors on the basis of their beliefs as to what the 
correct properties of $\Box$ should be.

Let us now describe the intuitionistic tense logic J,. This system will have, 
besides the connectives $\to, \vee, \wedge, f$ (f denoting falsehood), the operators G
and H. For a formula A, GA reads: "A will always be true", and HA
reads: "A was always true". The formula $\neg A$ abbreviates $A \to f$.

The axioms and rules of inference for J, will be those for the intuitionis- 
tic propositional logic (including modus ponens), and in addition 


$G[A \to B] \to [GA \to GB]$,
$H[A \to B] \to [HA \to HB]$,
$A \vee G\neg HA$, $A \vee H\neg GA$,
from $\vdash A$ to infer $\vdash GA$ and $\vdash HA$.


Kripke, Gabbay, and others, gave for many non-classical logics systems 
of semantics based on trees and valuations on trees. A formula would then 
be provable if it has a certain property under all possible valuations or 
interpretations. The detailed definition of interpretations would, of course, 
depend on the system of axioms in question. 

It turns out that these tree-semantics are expressible in S2S and variants 
thereof. This makes it possible to derive a multitude of positive decidability 
results from the Tree Theorem. 
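
As a small illustration of valuation-based semantics, the sketch below checks the modal axioms listed above on finite frames. It assumes the standard Kripke relational semantics ($\Box A$ holds at a world when A holds at every accessible world), which is simpler than the tree semantics the text refers to and is not taken from GABBAY [1975]; the formula encoding and all function names are hypothetical.

```python
from itertools import product

def subsets(worlds):
    """Yield every subset of a finite set of worlds."""
    worlds = sorted(worlds)
    for mask in range(2 ** len(worlds)):
        yield {w for i, w in enumerate(worlds) if (mask >> i) & 1}

def holds(formula, world, relation, valuation):
    """Truth of a formula at a world; formulas are nested tuples."""
    op = formula[0]
    if op == "var":
        return world in valuation[formula[1]]
    if op == "imp":
        return (not holds(formula[1], world, relation, valuation)
                or holds(formula[2], world, relation, valuation))
    if op == "box":
        return all(holds(formula[1], v, relation, valuation)
                   for (u, v) in relation if u == world)
    raise ValueError(f"unknown operator: {op}")

def valid_on_frame(formula, worlds, relation, variables):
    """True if the formula holds at every world under every valuation."""
    for choice in product(list(subsets(worlds)), repeat=len(variables)):
        valuation = dict(zip(variables, choice))
        if not all(holds(formula, w, relation, valuation) for w in worlds):
            return False
    return True

p, q = ("var", "p"), ("var", "q")
AXIOM_K = ("imp", ("box", ("imp", p, q)), ("imp", ("box", p), ("box", q)))
AXIOM_T = ("imp", ("box", p), p)
AXIOM_4 = ("imp", ("box", p), ("box", ("box", p)))

WORLDS = {0, 1, 2}
CHAIN = {(0, 1), (1, 2)}                        # neither reflexive nor transitive
REFLEXIVE = CHAIN | {(w, w) for w in WORLDS}    # reflexive, not transitive
PREORDER = REFLEXIVE | {(0, 2)}                 # reflexive and transitive
```

On these three frames the distribution axiom is valid everywhere, $\Box A \to A$ needs reflexivity, and $\Box A \to \Box\Box A$ needs transitivity, which matches the usual frame conditions for T and S4.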


4. Complexity of decision procedures 


4.1. Turing machine computations 
As remarked in the introduction, many results concerning lower bounds
on the complexity of solvable decision problems appeared in recent years.
In particular almost all the theories discussed in this chapter were shown to
have decision problems which do not admit of any simple decision
procedure.

Since our aim will be to show that every decision procedure for a theory 
T is complex, we must settle on a definite formulation for algorithms and a 
definite convention for counting computational steps. We shall choose 
Turing-machine algorithms as our decision procedures, and the execution 
of an atomic instruction will count as a basic step. 

The results will be of the form that for any decision procedure P for the 
theory T in question, there are sentences A of size n (i.e. written by use of 
n symbols) for which P will require at least f(cn) steps to produce an 
answer as to whether $A \in T$ or not. The function f(n) will be at least
exponential, $2^n$, and c will be a fixed number 0 < c. Because of the
exponential nature of the results, it will make little difference which model 
of algorithms and computations is chosen. Computation-times in different 
models for the same algorithm differ by at most a polynomial transforma- 
tion. 

The method for obtaining these inherent complexity results rests on the 
following observation. 

Let us transcribe programs for Turing machines in a uniform standard
way by sequences $P \in \{0, 1\}^*$. For any word x define l(x) to be the length
of x, i.e. the number of symbols in x. In particular, for an integer written in
binary notation, l(n) is the number of digits in n.

Let T be a theory in the language L, and f(k) be a function satisfying the
following conditions. There exists a constant 0 < d so that for every
program P and integer n, there exists a sentence S(n, P) of L satisfying:

(i) $l(S(n, P)) \le d\,(l(n) + l(P))$,

(ii) $S(n, P) \in T$ if and only if a computation by the program P on input
n (viewed as a 0-1 sequence) halts in fewer than f(l(n)) steps.

(iii) The formula S(n, P) can be effectively computed from n, P in fewer
than g(l(n) + l(P)) steps, where g(k) is a fixed polynomial.

If f(k) is a function growing at least at exponential rate, then under the
above conditions there exists a constant 0 < c so that, for every decision
procedure P for T and for infinitely many n, there exists a sentence F of L,
l(F) = n, for which P requires at least f(cn) steps to decide whether
$F \in T$.

The proof of the above statement is by a familiar diagonalization
argument. One assumes, by way of contradiction, that there exists a
decision procedure P for T which requires for every sentence F, fewer than
f(c l(F)) steps to decide whether $F \in T$. If c = 1/2d then, by use of the
sentence S(n, P), one can construct a Turing machine which stops on an
input $n_0$ if and only if it does not stop on that input.


4.2. The theory WS1S 

MEYER [1975] has proved that the decision problem of WS1S is of a very
high inherent complexity. 

Define a function F(n,m) by 


$F(n, 1) = 2^n$, $F(n, m+1) = 2^{F(n, m)}$, $m = 1, 2, \ldots$.


If 0< d, then f(n) = F(n,[dn]) is a function which is an exponentiation 
by a linear stack of 2’s. 
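
To get a feeling for how quickly a stack of 2's grows, the recursion for F can be evaluated directly for very small arguments. This is only an illustrative sketch; the function name `F` mirrors the text, and anything beyond a stack of height four or five is far too large to compute.

```python
def F(n: int, m: int) -> int:
    """F(n, 1) = 2**n and F(n, m + 1) = 2**F(n, m), for m >= 1."""
    if m < 1:
        raise ValueError("m must be at least 1")
    value = 2 ** n
    for _ in range(m - 1):
        value = 2 ** value
    return value
```

For example, F(1, 4) is 65536, while F(1, 5) already has 65537 binary digits.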


THEOREM 18. There exists a constant 0 < d so that for the function f(n) =
F(n,[dn]), and every algorithm P for solving the decision problem of WS1S,
there exist infinitely many formulas A so that P requires more than f(l(A))
steps to decide whether $A \in$ WS1S.


PROOF (outline). We have available in the language of WS1S variables
$\alpha, \beta, \ldots$, ranging over finite subsets of $\omega$. A pair of subsets $\alpha, \beta$ can be
used to code a sequence $p \in \{0,1\}^*$, $l(p) = c(\alpha)$. If
$\alpha = \{i_0 < i_1 < \cdots < i_{k-1}\}$, then $i_j \in \beta$ if and only if p(j) = 1. For a
fixed $\alpha$ and variable $\beta$, the pairs $(\alpha, \beta)$ will run through codes of all
sequences p such that $l(p) = c(\alpha)$.

For the above $\alpha$ and $x \in \alpha$, $y \in \alpha$, we shall say that x and y are d apart
in $\alpha$ if $x = i_j$, $y = i_{j+d}$ for some $j \le k - 1 - d$.
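
The coding of a 0-1 sequence by a pair of finite sets, and the "d apart" relation, can be made concrete as follows. This is a minimal sketch with hypothetical helper names (`encode`, `decode`, `d_apart`); finite subsets of $\omega$ are represented as Python sets of non-negative integers.

```python
def encode(p, alpha):
    """Return beta such that (alpha, beta) codes the 0-1 sequence p.
    alpha must have exactly len(p) elements i_0 < i_1 < ... < i_{k-1};
    i_j is put into beta exactly when p[j] == 1."""
    indices = sorted(alpha)
    if len(indices) != len(p):
        raise ValueError("alpha must have as many elements as p has entries")
    return {i for i, bit in zip(indices, p) if bit == 1}

def decode(alpha, beta):
    """Recover the sequence coded by the pair (alpha, beta)."""
    return [1 if i in beta else 0 for i in sorted(alpha)]

def d_apart(alpha, x, y, d):
    """x = i_j and y = i_{j+d} for some j, in the increasing enumeration of alpha."""
    indices = sorted(alpha)
    if x not in alpha or y not in alpha:
        return False
    return indices.index(y) - indices.index(x) == d
```

With $\alpha$ fixed, distinct sequences of length $c(\alpha)$ receive distinct sets $\beta \subseteq \alpha$, so the pairs $(\alpha, \beta)$ with $\beta \subseteq \alpha$ are in one-to-one correspondence with those sequences.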

One can now show that for every n there exist two formulas $A_n(\alpha)$ and
$D_n(\alpha, x, y)$ of WS1S which are of length O(n) and have the following
properties. $A_n(\alpha)$ implies that $\alpha$ has a certain structure and is at least
of cardinality $(F(n, n))^2$. For sets $\alpha$ for which $A_n(\alpha)$ holds,
$D_n(\alpha, x, y)$ means that x and y are at distance F(n, n) apart in $\alpha$.

Suppose that we have a Turing machine computation with fewer than 
F(n,n) steps. Then the machine-head will never be farther than F(n, n) 
squares away from the starting square. We can assume without loss of 
generality that the machine never crosses to the left of the starting square. 

We can string out in order, from left to right, the complete descriptions
of the stretch of the first (leftmost) F(n, n) squares after the execution of
each of the machine-instructions. This will be a sequence of length at most
$(F(n, n))^2$. This sequence can be coded by use of an $\alpha \subseteq \omega$ which
satisfies $A_n(\alpha)$, and additional sets $\beta_0, \beta_1, \ldots, \beta_k$. The pair
$(\alpha, \beta_0)$ will code the tape-contents, and $((\alpha, \beta_1), \ldots, (\alpha, \beta_k))$ will
code the sequence of head locations and machine states.

The formula $D_n(\alpha, x, y)$ will serve as a "ruler" measuring off stretches of
length F(n, n). Together with the (definable) order on $\alpha$, it will enable us
to express the fact that two consecutive stretches of the sequence coded by
$(\alpha, \beta_0, \ldots, \beta_k)$ are related by an execution of a single Turing machine
instruction.

Filling out the details and combining the above ideas, it is possible to 
show that there exists for WS1S a construction of a formula S(n, P) with 
the properties enumerated in 4.1. This entails Theorem 18. O 


It was independently observed by E. Robertson and by L. Stockmeyer 
(in his thesis) that a close examination of the full proof of Theorem 18 
reveals that it will go through for sentences pertaining to $(\omega, \le)$ which are
universal monadic second-order. This means sentences which may contain
set quantifiers but these are all $\forall$ quantifiers and appear at the beginning of
the sentence. In fact, a single set variable will suffice. A method of direct
interpretations will yield from this more detailed result the following 
theorem due to MEYER [1974, 1975]. 


THEOREM 19. The first-order theory Th(OR) of linearly ordered sets has 
inherent complexity F(n,[dn]) for some 0<d. 


The detailed formulation of Theorem 19, as well as of the results in the 
next subsection, is as in Theorem 18. 

An analysis of the automata-theory based decision procedures for WS1S 
and even S2S shows that they run in time F(n,[cn]) for formulas A of size 
n, for an appropriate 0<c. In view of Theorems 18-19 these results are, 
qualitatively, best possible. There is, of course, the question of the height 
[cn] of the stack of 2’s, but this depends on the notation for the formulas 
and does not seem to be readily answerable. 


4.3. Theories of addition and real-closed fields 

For the classical theories $\mathrm{Th}(\langle \omega, + \rangle)$ = PAR, and the theory of the field
of real numbers Th(RLC), the inherent complexity results are not as
devastating as for WS1S. It does, however, turn out that these theories are 
at least exponentially complex, and in some cases super-exponentially 
complex. Thus the contention that the existence of a decision procedure 
trivializes these theories is not justified. 


THEOREM 20 (FISCHER and RABIN [1974]). There exists a constant 0 < c so
that the decision problem of Presburger's arithmetic PAR is at least of
complexity $2^{2^{cn}}$.


PROOF (outline). We saw that the ability to establish inherent complexity
results for a complexity function f(n) rests on the possibility to code within
the theory, by formulas of size O(n), sequences of length $(f(n))^2$.

There exist in the language of $\langle \omega, + \rangle$ formulas $P_n(x, y, z)$ of size O(n),
which are true for any $x, y, z \in \omega$ if and only if $x, y, z < F(n, 3)$ and
$x \cdot y = z$. Thus such a formula, which involves only + and is of size O(n),
codes the multiplication table up to that bound. Using integers to code 0-1
sequences, it is now possible to code sequences of length up to $(2^{2^n})^2$ by
employing $P_{n+1}(x, y, z)$. □


The lower bound for the complexity of Th(RLC) = $\mathrm{Th}(\langle R, +, \cdot \rangle)$, where
R is the field of real numbers, is obtained by considering just $\langle R, + \rangle$.


THEOREM 21 (FISCHER and RABIN [1974]). $\mathrm{Th}(\langle R, + \rangle)$, and consequently
also Th(RLC), are of inherent complexity at least $2^{cn}$ for some 0 < c.


The proof is similar to the proof of Theorem 20. In this case it is possible 
to reproduce (up to isomorphism) by a short formula the multiplication 
table of integers up to $2^n$.

For the theory of $\mathfrak{M} = \langle \omega, \cdot \rangle$ of the integers under multiplication, the
situation is even worse according to a theorem mentioned in FISCHER and
RABIN [1974], the proof of which will be given in a forthcoming paper of
Fischer and Rabin.


THEOREM 22. The theory $\mathrm{Th}(\langle \omega, \cdot \rangle)$ of multiplication of natural numbers is
of inherent complexity at least F(cn, 3), i.e. $2^{2^{2^{cn}}}$, for some 0 < c.


Algorithms carefully constructed by various researchers strongly suggest 
that the above lower-bound results are best possible. 


4.4. Propositional calculus and P = NP 

It is customary in the study of abstract computation-models to make a 
distinction between deterministic and non-deterministic algorithms. When 
presented with a state-symbol combination, a Turing machine will execute 
a definite basic computational step (atomic move). The overall course of 
the computation is, therefore, completely determined by the Turing
machine (program), the starting state, and the initial tape-input. The
deterministic mode is, of course, a feature of all present-day computers.

The notion of non-deterministic computations, or programs, is of 
fundamental importance in theoretical studies of the properties of al- 
gorithms. A non-deterministic program P allows, when presented with a 
state-symbol combination, the execution of one out of possibly several 
basic moves. For example state $q_3$, when presented with 1, may call for
either $(0, L, q_7)$ (erase 1, move to left, go to state $q_7$) or $(1, R, q_{15})$.
Thus, in a non-deterministic program, to each pair (q, b) where q is a state
and b a symbol, there corresponds a set of triplets $(c, M, q_i)$, where c is a
symbol, $M \in \{L, R\}$, and $q_i$ is a state.

When started in state qo on an input tape, the program P may be able to 
go through any one of several sequences of basic steps, i.e. perform
different computations on the input. It should be borne in mind that in each 
particular run a definite unique computation is performed. But several 
different runs, or threads, may be possible. 

Let us illustrate the idea by showing that there is a non-deterministic 
program P which will factor any composite number n in f(/(n)) steps 
where f(k) is a polynomial. 

The program P has non-deterministic instructions enabling it, when
given input n (in binary notation), to write on the tape any two numbers
1 < b, c < n. As observed before, in any given run, one pair (b, c) will be
written. But for every pair, there exists a run producing that pair. After
b, c was produced, the program switches to a deterministic mode, calculates
$b \cdot c$ and checks whether $n = b \cdot c$. The machine will stop only if the test
showed equality.

We may observe the following features. Not every computation by P on 
n will halt. But if n is indeed composite then there are computations which 
will halt after a number of computational steps which is polynomial in the 
size l(n) of the input.

This can be summarized by saying that compositeness of numbers can be 
non-deterministically recognized in polynomial time. 
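
The guess-and-check structure of this program can be sketched as follows. The deterministic checking phase is the function `verify`; the non-deterministic guessing phase has no direct counterpart on an ordinary computer, so `accepting_runs` simply enumerates all guesses. Both names are hypothetical, and the enumeration is a deterministic simulation whose number of candidate pairs is exponential in l(n); it is not the polynomial-time non-deterministic program itself.

```python
def verify(n: int, b: int, c: int) -> bool:
    """Deterministic phase: accept exactly when the guessed pair
    (b, c) with 1 < b, c < n satisfies b * c == n."""
    return 1 < b < n and 1 < c < n and b * c == n

def accepting_runs(n: int) -> list[tuple[int, int]]:
    """All guesses (b, c) on which the program halts.
    n is composite exactly when this list is non-empty."""
    return [(b, c) for b in range(2, n) for c in range(2, n) if verify(n, b, c)]
```

For a single run only `verify` is executed, and a multiplication followed by a comparison takes a number of steps polynomial in l(n).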

Consider now the problem of determining whether a propositional
formula $F(p_1, \ldots, p_n)$ has a truth-values substitution for the propositional
variables, so that F becomes true. This is the satisfiability problem for the
propositional calculus.

Since $F(p_1, \ldots, p_n)$ is not satisfiable if and only if $\neg F(p_1, \ldots, p_n)$ is a
formal theorem of propositional calculus, the satisfiability problem is
closely related to the decision problem of PC.

It is again easy to construct a non-deterministic program which will
determine whether F is satisfiable by a number of steps which is
polynomial in l(F).
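
A sketch of the same guess-and-check idea for satisfiability: one run guesses a truth assignment and evaluates the formula, which takes time polynomial in the size of the formula. The code below uses an assumed encoding of formulas as nested tuples and hypothetical function names, and it replaces guessing by an exhaustive search over all $2^n$ assignments.

```python
from itertools import product

def variables(formula) -> set:
    """Names of the propositional variables occurring in the formula."""
    if formula[0] == "var":
        return {formula[1]}
    return set().union(*(variables(sub) for sub in formula[1:]))

def evaluate(formula, assignment) -> bool:
    """Deterministic check of one guessed assignment."""
    op = formula[0]
    if op == "var":
        return assignment[formula[1]]
    if op == "not":
        return not evaluate(formula[1], assignment)
    if op == "and":
        return evaluate(formula[1], assignment) and evaluate(formula[2], assignment)
    if op == "or":
        return evaluate(formula[1], assignment) or evaluate(formula[2], assignment)
    raise ValueError(f"unknown connective: {op}")

def satisfying_assignments(formula):
    """Yield every assignment that makes the formula true."""
    names = sorted(variables(formula))
    for values in product([False, True], repeat=len(names)):
        assignment = dict(zip(names, values))
        if evaluate(formula, assignment):
            yield assignment

def satisfiable(formula) -> bool:
    return next(satisfying_assignments(formula), None) is not None

def tautology(formula) -> bool:
    """F is a tautology exactly when its negation is not satisfiable."""
    return not satisfiable(("not", formula))
```

The function `tautology` reflects the relation between satisfiability and theoremhood noted above: a formula is unsatisfiable exactly when its negation is a tautology.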

Does there exist an ordinary, deterministic, decision procedure for 
satisfiability which requires time (i.e. number of steps) which is just 
polynomial in the size of the formula? The polynomial in question may, of 
course, be faster growing than the polynomial for the non-deterministic 
program. 

The answer to this question is not known. However Cook [1971] has 
shown that: 


THEOREM 23. If the satisfiability problem of the propositional calculus can be
(deterministically ) solved in polynomial time, then any problem which can be 
solved non-deterministically in polynomial time can also be solved in 
polynomial time by a deterministic algorithm. 


Thus the question whether there exists an efficient (polynomial) al- 
gorithm for satisfiability is equivalent to the question whether the class P of 
algorithms requiring polynomial time is equipotent with the class NP of 
non-deterministic algorithms requiring polynomial time. This is the cele- 
brated P = NP problem. 

The algorithms in NP are very powerful. For example, an isomorphism 
between two given graphs of size n can be non-deterministically found in 
polynomial time. Similarly for an Hamiltonian circuit. These are difficult 
combinatorial-computational problems which defied repeated attempts at 
simple solutions. 

Cook, Karp [1972], and others, found many examples of combinatorial 
decision problems which are reducible and in a certain sense equivalent to 
the satisfiability problem. The weight of this evidence may point in the 
direction that the satisfiability problem, being so powerful, is not of 
polynomial complexity and hence P ≠ NP. But this fundamental question
is, as yet, unanswered.

1. Introduction


This chapter is intended for readers who have at least a nodding 
acquaintance with some basic ideas from recursive function theory. The 
ideas can be found in Chapter C.1. 

The structure of the degrees is essentially a classification of all sets of 
integers according to complexity. The recursive sets of integers comprise 
the lowest level of complexity. A degree (sometimes called a Turing degree 
or a degree of unsolvability) is just an equivalence class of sets of integers, 
under the following equivalence relation: $S_1$ is recursive relative to $S_2$, and
$S_2$ is recursive relative to $S_1$. Trivially, each set of integers falls into one
of these equivalence classes. The degrees are partially ordered as follows:
$d_1 > d_2$ if and only if sets in the equivalence class $d_1$ are more nonrecursive
than sets in the equivalence class $d_2$.

The simple idea just described leads to a rich and interesting theory. In 
fact, for many years now, degree theory has been one of the most technical 
and highly developed parts of mathematical logic. There are literally 
hundreds of papers in the literature, all devoted exclusively to degrees. The 
standard of originality in these papers is very high. Although certain ideas 
recur, the variety of methods employed is enormous. Some workers even 
make a point of not publishing a theorem unless the proof illustrates a new 
technique. 

The situation just described presents a hardship to him who attempts to 
bring order and clarity to degree theory. Nevertheless, there do exist some 
truly excellent surveys, including LACHLAN [1973], SACKS [1963],
SHOENFIELD [1971], SOARE [1976], and YATES [1976]. (Perhaps SHOENFIELD [1971]
is best for the beginner.) In keeping with the way degree theory has 
developed, all of the papers just mentioned lay heavy emphasis on methods 
of proof. 

In the present survey we indulge in a few methodological remarks. 
However, we have made a conscious decision to neglect methodology and 
to concentrate instead on results. Thus we try to present those theorems 
whose statements alone shed the most light on the structure and uses of 
degrees. Such theorems do not always have the most interesting proofs. 
Our reasons for this choice of emphasis are the following: 

(1) As mentioned above, there already exist several excellent 
methodological surveys, on which we could not possibly hope to improve. 

(2) Our typical reader is probably not himself planning to perform 
research in degree theory. He is therefore likely to be more interested in 
results than in methods of proof.

(3) We feel that now would be a good time for degree theorists to take a
critical, backward look at the concrete accomplishments in their field. We 
hope that a survey of results will promote this kind of introspection and 
thereby suggest directions for future research. 

Another aspect of degree theory which we neglect here is history. This is 
partly because we prefer to discuss later, comprehensive results instead of 
earlier, fragmentary ones. Besides, an historical approach might be mis- 
leading, inasmuch as the general conception of what degree theory is all 
about seems to have changed over the years. Nevertheless, in order not to 
commit the crime of being totally ahistorical, we now point out the obvious 
fact that the published literature of degree theory begins with papers of 
Emil L. Post. (Sections 5 and 6 of the present survey discuss problems 
which have their origin in Post [1944]. Sections 2, 3, and 4 are closer in 
spirit to Post [1948] and KLEENE and Post [1954]. Serious historical
difficulties arise from the fact that much of Post’s work remains unedited, 
unpublished, and inaccessible.) 

Since the days of Post, degree theory has turned out to have a number of 
connections with other parts of mathematical logic. Unfortunately, limita- 
tions of space prevent our discussing these aspects here. Another impor- 
tant topic which we omit is generalizations of degree theory. A recent 
survey of $\alpha$-degrees is contained in Chapter C.5. A degree notion which is
of particular importance for descriptive set theory is discussed in the 
forthcoming Ph.D. theses of J. Steel and W. Wadge, both of the University 
of California at Berkeley. 

The rest of this introduction consists of a brief definition of some of the 
basic notions of degree theory. For more details, the reader may consult 
Section 8 of Chapter C.1. 

We use $\omega$ to denote both the smallest infinite ordinal and the set
{0, 1, 2, ...} of nonnegative integers. We use $2^\omega$ to denote the set of all
{0, 1}-valued functions on $\omega$. We use letters such as f, g, h to denote
elements of $2^\omega$.

For $f, g \in 2^\omega$ we use $f \oplus g$ to denote the unique function $h \in 2^\omega$ such
that h(2n) = f(n), h(2n + 1) = g(n) for all $n \in \omega$. We write $f \le_T g$ (read:
f is recursive in g, f is Turing reducible to g, f is computable from g) if
there exists an algorithm which computes f(n) from n using an oracle for g,
for all $n \in \omega$. We assume a fixed Gödel numbering of the algorithms. For all
$f \in 2^\omega$ we define $f^* \in 2^\omega$ by: $f^*(n) = 1$ if the n-th algorithm eventually
halts if started with input n and oracle f; $f^*(n) = 0$ otherwise.

1.1. PROPOSITION. (i) $f \le_T f$.
(ii) $f \le_T g$, $g \le_T h$ implies $f \le_T h$.
(iii) $f \oplus g \le_T h$ if and only if $f \le_T h$, $g \le_T h$.
(iv) $f \le_T f^*$ and not $f^* \le_T f$.
(v) $f \le_T g$ implies $f^* \le_T g^*$.
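
The join operation interleaves two functions on the even and odd integers, taking h(2n) = f(n) and h(2n + 1) = g(n). The sketch below models elements of $2^\omega$ as Python functions from non-negative integers to {0, 1}; the names `join`, `left` and `right` are hypothetical. It illustrates why both f and g are computable from $f \oplus g$ and why $f \oplus g$ is computable from any h that computes both, as in 1.1(iii).

```python
def join(f, g):
    """Return h = f (+) g with h(2n) = f(n) and h(2n + 1) = g(n)."""
    def h(n: int) -> int:
        return f(n // 2) if n % 2 == 0 else g(n // 2)
    return h

def left(h):
    """Recover f from h = f (+) g by reading the even positions."""
    return lambda n: h(2 * n)

def right(h):
    """Recover g from h = f (+) g by reading the odd positions."""
    return lambda n: h(2 * n + 1)
```

Each call of `left(h)` or `right(h)` queries h exactly once, so these are particularly simple oracle computations.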


Two functions $f, g \in 2^\omega$ are said to have the same degree if $f \le_T g$ and
$g \le_T f$. The set of all degrees is denoted D. Boldface letters such as
a, b, c, d are used to denote degrees. The degree of f is denoted deg(f). The
degree of a set $A \subseteq \omega$ is the degree of its characteristic function
$c_A \in 2^\omega$.

Let a = deg(f), b = deg(g). A binary relation $\le$ on D is defined by
$a \le b$ if and only if $f \le_T g$. A binary operation $\cup$ on D is defined by
$a \cup b = \deg(f \oplus g)$. By 1.2(ii) and 1.2(v), D is an upper semilattice under
$\cup$. A unary operation $j : D \to D$, called the jump operator, is defined by
$j(a) = a' = \deg(f^*)$. A distinguished element of D is defined by
$0 = \deg(\lambda n.\,0)$. Thus 0 is the degree of recursive functions.


1.2. PROPOSITION. (i) D has cardinality $2^{\aleph_0}$.
(ii) D is partially ordered by $\le$.
(iii) A subset of D has an upper bound (under $\le$) if and only if it is
countable.
(iv) 0 is the least element of D.
(v) $a \cup b$ is the least upper bound of a, b.
(vi) $a < a'$ for all a.
(vii) $a \le b$ implies $a' \le b'$.


The jump operator will play a fundamental role in this chapter. We 
therefore mention two important characterizations of it: 

(1) A subset of $\omega$ is said to be recursively enumerable in a degree a if it
is either empty or the range of a function $p : \omega \to \omega$ which is recursive in
a. A degree b is said to be r.e. in a if b is the degree of a set which is
recursively enumerable in a. Then: $a'$ is the largest degree which is r.e. in a.

(2) We say that f is limit recursive in g if there exists a function
$p : \omega \times \omega \to \omega$ such that p is recursive in g and $f(m) = \lim_n p(m, n)$ for
all $m \in \omega$. Then for a = deg(f), b = deg(g), we have: $a \le b'$ if and only if
f is limit recursive in g.
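
The second characterization can be illustrated by a toy approximation to a halting function. In the sketch below, "programs" are Python generator functions from a short hypothetical list `PROGRAMS` (an assumption standing in for a Gödel numbering), and p(m, n) records whether program m has halted within n steps. Each p(m, n) is computable, the values change at most once as n grows, and their limit is the halting function for this list.

```python
def halts_after(k: int):
    """A toy program that performs k steps and then halts."""
    def program():
        for _ in range(k):
            yield
    return program

def loops_forever():
    """A toy program that never halts."""
    while True:
        yield

# Hypothetical stand-in for an enumeration of all algorithms.
PROGRAMS = [halts_after(0), loops_forever, halts_after(5), halts_after(2)]

def p(m: int, n: int) -> int:
    """1 if toy program m halts within n steps, else 0.
    For fixed m the value is nondecreasing in n and eventually constant."""
    run = PROGRAMS[m]()
    for _ in range(n + 1):
        try:
            next(run)
        except StopIteration:
            return 1
    return 0
```

No finite stage n reveals whether the value p(m, n) = 0 is final, which is the sense in which the limit is only computable from the jump.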

If a = deg(f) and $n \in \omega$, we write $a^{(n)} = \deg(f^{(n)})$ where $f^{(0)} = f$,
$f^{(n+1)} = (f^{(n)})^*$. Thus $a^{(0)} = a$ and $a^{(n+1)} = (a^{(n)})'$, so in effect we
have defined finite iterates of the jump operator. The beginning of an
extension into the transfinite is obtained by putting
$a^{(\omega)} = \deg(f^{(\omega)})$ where $f^{(\omega)} \in 2^\omega$ is defined by


$f^{(\omega)}(2^m(2n + 1) - 1) = f^{(m)}(n)$


for all $m, n \in \omega$. Note that the degree $a^{(\omega)}$ is an upper bound of the
increasing sequence $a, a', a'', \ldots, a^{(n)}, a^{(n+1)}, \ldots$ ($n \in \omega$). Further
transfinite iterates of the jump operator will be defined in Section 4.
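
The definition of $f^{(\omega)}$ relies on the fact that every non-negative integer k can be written in exactly one way as $2^m(2n + 1) - 1$, so that all the functions $f^{(m)}$ are packed into a single function without collisions. Assuming that indexing, the sketch below (with hypothetical names `pair` and `unpair`) computes the correspondence in both directions.

```python
def pair(m: int, n: int) -> int:
    """Position of f^(m)(n) inside f^(omega): 2**m * (2n + 1) - 1."""
    return 2 ** m * (2 * n + 1) - 1

def unpair(k: int) -> tuple[int, int]:
    """Inverse of pair: write k + 1 as 2**m * (2n + 1) and return (m, n)."""
    k += 1
    m = 0
    while k % 2 == 0:
        k //= 2
        m += 1
    return m, (k - 1) // 2
```

Reading off the positions pair(m, 0), pair(m, 1), ... recovers $f^{(m)}$ from $f^{(\omega)}$, which is why $a^{(\omega)}$ bounds every $a^{(m)}$.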


2. Structure of the degrees without jump 


In this section we discuss the structure of D, the set of all degrees, 
regarded as either a partially ordered set $(D, \le)$ or an upper semilattice
(D, U). Results whose statements mention the jump operator are reserved 
for later sections. The trend of the results in this section is that the structure 
of D is very rich, indeed sometimes intractable, but has a number of 
pleasing properties such as 2.2 and 2.3. 

We begin by discussing the types of suborderings of D. Let S be a 
partially ordered set which is embeddable in D. Then trivially, by 1.2, we 
have: 

(i) the cardinality of S is $\le 2^{\aleph_0}$;

(ii) every element of S has at most countably many predecessors. A
problem of SACKS [1963] which remains open is whether the converse is
true, i.e. whether every partially ordered set satisfying (i) and (ii) is 
embeddable in D. The main positive result on this problem is the following, 
which incidentally solves the problem completely if the continuum 
hypothesis holds. 


2.1. THEOREM (SACKS [1963]). Let S be a partially ordered set which satisfies
conditions (i), (ii) and (iii): every element of S has at most $\aleph_1$ successors.
Then S is order-embeddable in D.


Of course a corollary of 2.1 is that every countable partially ordered set 
is embeddable in D. 

In the proof of 2.1, an embedding of S into D is obtained as the limit of a 
transfinite sequence of extensions to successively larger subsets of S. This 
suggests a difficult general problem: when can an embedding of a partially 
ordered set into D be extended to an embedding of a larger, partially 
ordered set into D? An interesting special case of this problem involves 
independent sets, i.e. sets of nonzero degrees such that no element of the set
is less than or equal to the least upper bound of finitely many other
elements of the set. By Zorn's Lemma, any independent set can be
extended to a maximal independent set. It is not hard to show that there
exists an independent set of cardinality $2^{\aleph_0}$, and that every maximal
independent set has cardinality $\ge \aleph_1$ (more generally $\ge \aleph_{\alpha+1}$, assuming
Martin's axiom for $\aleph_\alpha$). SACKS [1963] asks: does every maximal independent
set have cardinality $2^{\aleph_0}$? The answer to this question may well be
independent of ZFC.

We now introduce the important topic of ideals in D. A set IC D is 
called an ideal if 

(i) $0 \in I$;

(ii) $a \le b \in I$ implies $a \in I$;

(iii) $a, b \in I$ implies $a \cup b \in I$.

For example, if a is any degree then we have the principal ideal


$I(a) = \{d \mid d \le a\}$.


Note that any principal ideal is countable. A nonprincipal, countable ideal
is generated by any countable ascending sequence of degrees, e.g.
$\{0^{(n)} \mid n \in \omega\}$. Another example of a nonprincipal ideal is D itself.

Our use of the word ideal is not standard terminology, but it is certainly 
natural, since conditions (i)-(iii) are equivalent to saying that I is the kernel 
of a homomorphism of (D, U,0) onto another upper semilattice with 
distinguished least element. Later we shall have more to say about the 
images of these homomorphisms. 

For the moment we regard an ideal as simply a subordering of D and 
ask: what can we say about its order type? Clearly an ideal is an upper
semilattice with a least element, but can we say more? The answer in the 
countable case is very pleasing: 


2.2. THEOREM (LACHLAN and LEBEUF [1976]). Every countable upper
semilattice with a least element is isomorphic to a countable ideal in D.


An important special case of 2.2, due to Spector [1956], is that there 
exists a minimal degree, i.e. a degree m such that 0 is the unique degree 
less than m. 

The proof of 2.2 is combinatorially intricate. However, the basic 
recursion-theoretic idea of the proof goes back to Spector [1956] and is not 
difficult. The history of 2.2 and a major contribution to the combinatorial 
part of its proof are in LERMAN [1971]. 

Uncountable generalizations of 2.2 are largely mysterious; e.g., it is open 
whether D has an ideal of order type $\aleph_1$. YATES [1970] remarks that this
latter statement is consistent¹ with ZFC, and makes conjectures. One
isolated fact is that D has an ideal which is order-isomorphic to the lattice
of finite sets of real numbers (cf. THOMASON [1970]). An easy corollary of
this fact is that every maximal, pairwise incomparable set of degrees has
cardinality $2^{\aleph_0}$ (Sacks [1963]).

The next theorem says something important about the way a countable 
ideal sits in D. 


2.3. THEOREM (SPECTOR [1956]). Let I be any countable ideal in D. Then
there exists a (nonunique) pair of degrees $a_1, a_2$ such that


$I = \{d \mid d \le a_1 \text{ and } d \le a_2\}$.


In the special case of principal ideals, 2.3 tells us that every degree is the 
greatest lower bound of two larger degrees. A corollary of 2.3 for 
nonprincipal ideals is that an infinite ascending sequence of degrees can 
never have a least upper bound. 
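
The corollary for ascending sequences can be spelled out as follows. This is only a sketch of the standard argument; the names $c_n$, $s$ and the particular presentation of $I$ are introduced here for illustration and are not taken from the source.

```latex
\begin{align*}
&\text{Let } c_0 < c_1 < c_2 < \cdots \text{ and put }
  I = \{d \mid d \le c_n \text{ for some } n\},
  \text{ a countable ideal.}\\
&\text{By 2.3 choose } a_1, a_2 \text{ with }
  I = \{d \mid d \le a_1 \text{ and } d \le a_2\}.\\
&\text{Suppose } s \text{ were a least upper bound of } \{c_n \mid n \in \omega\}.
  \text{ Both } a_1 \text{ and } a_2 \text{ bound every } c_n,
  \text{ so } s \le a_1 \text{ and } s \le a_2.\\
&\text{Hence } s \in I, \text{ i.e. } s \le c_n \text{ for some } n,
  \text{ and then } c_{n+1} \le s \le c_n < c_{n+1},
  \text{ a contradiction.}
\end{align*}
```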

In general, a pair $a_1, a_2$ satisfying the conclusion of 2.3 is called an exact
pair for I. The existence of exact pairs implies that in the first-order theory
of $(D, \cup)$ we can talk about and quantify over arbitrary countable ideals.
This expressive power is exploited in the proof of Theorem 2.7 below. 
Further exploitation of exact pairs occurs in Theorem 4.3. 

We now consider what happens to D when we “mod out” by a countable
ideal I. Intuitively, the idea is to pretend that all functions in


$M_I = \{f \mid \deg(f) \in I\}$


are recursive, and ask what effect this identification has on the structure of
D. There are two distinct ways to make this idea precise:

(1) Define an equivalence relation $\equiv_I$ on D by: $a \equiv_I b$ if and only if
$a \cup d = b \cup d$ for some $d \in I$. Let $D/I$ be the set of all equivalence classes.

(2) Let $D_I = \{a \mid a > d \text{ for all } d \in I\}$.

Both $D/I$ and $D_I$ are upper semilattices. $D/I$ is algebraically more
natural since it has a least element, viz. I itself. $D/I$ is in fact just the
quotient of D by I in the category of upper semilattices with distinguished
least element. By 2.3 $D_I$ can never have a least element. (However, there is
a natural isomorphism of $D_I$ onto a certain upward closed subset of $D/I$. If
I is principal, then this subset consists of the nonzero elements of $D/I$.)
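
The two constructions can be tried out on a finite example. The sketch below again assumes the toy semilattice of subsets of {0, 1, 2} under union (not D itself); the function names `equivalent_mod`, `quotient` and `strictly_above` are hypothetical. Being finite, the toy cannot reflect statements such as the absence of a least element in general; it only illustrates the definitions.

```python
from itertools import combinations

# Toy upper semilattice: subsets of {0, 1, 2}, join = union.
GROUND = (0, 1, 2)
ELEMENTS = [frozenset(c) for r in range(len(GROUND) + 1)
            for c in combinations(GROUND, r)]


def equivalent_mod(ideal, a, b):
    """a and b are identified mod I iff a | d == b | d for some d in I."""
    return any(a | d == b | d for d in ideal)


def quotient(ideal):
    """The set of equivalence classes (construction (1))."""
    return {frozenset(b for b in ELEMENTS if equivalent_mod(ideal, a, b))
            for a in ELEMENTS}


def strictly_above(ideal):
    """Elements strictly above every member of I (construction (2))."""
    return {a for a in ELEMENTS if all(d < a for d in ideal)}
```

For the principal ideal I = {∅, {0}} of the toy, the quotient has four classes, one of which is I itself, and `strictly_above(I)` has three elements, matching the count of nonzero classes.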


¹ The metalanguage for this paper is ZFC, Zermelo-Fraenkel set theory with the axiom of
choice. When discussing questions of consistency and independence, we tacitly assume that 
ZFC is consistent. 
An interesting, general phenomenon is that many structural results
about D hold also for $D/I$ and $D_I$ when I is a countable ideal, with only
slight changes in the proofs. For instance, 2.1 is true verbatim with D
replaced by $D/I$ or $D_I$, I countable. The routine procedure, whereby
theorems and their proofs are generalized from D to $D/I$ or $D_I$, is called
relativization to I (cf. Section 9.3 of ROGERS [1967a]). The validity of this
procedure rests on the fact that $M_I$ has many of the same closure properties
as the set of recursive functions; all the more so when I is principal. 
Usually one speaks of relativization to a degree a rather than to the 
principal ideal I(a). The relativized versions of many theorems such as 2.2 
have been verified for principal ideals (but sometimes not for arbitrary 
countable ideals, except in special cases). 

A tantalizing conjecture (ROGERS [1967a], p. 261) is that for all
principal ideals $I(a)$, $D/I(a)$ is isomorphic to D. Certainly $D/I(a)$ and D
share many structural properties. It is also natural to consider the following
generalization of Rogers’ conjecture: for all countable ideals I,


$D_I \cong D/I - \{I\} \cong D - \{0\}$.


Two further open problems, also due to ROGERS [1967b], are whether any,
or every, nonzero degree is invariant under all automorphisms of $(D, \cup)$.
All of these Rogers problems are wide open. (If we change the problems by 
requiring isomorphisms and automorphisms to be jump-preserving, then 
considerable progress has been made. See Corollaries 3.6 and 3.8 below.) 

So far in this section we have safely ignored the jump operator. We now
wish to point out that there are many theorems, about the upper
semilattice structure of D, which do not mention jump but whose proofs
use it. Perhaps the simplest example of such a theorem is the following. Say
that a degree b splits if it is the least upper bound of two smaller degrees $b_1$
and $b_2$. Not every nonzero degree splits, e.g. a minimal degree. However:


2.4. THEOREM. $\exists a\, \forall b\, (b > a \rightarrow b \text{ splits})$.


This is proved as follows. Take $a = 0'$. Given $b > 0'$, use Friedberg’s
jump-inversion theorem 3.1 to find a degree c such that $c' = c \cup 0' = b$.
Then put $b_1 = c$, $b_2 = 0'$.
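
For the record, the three facts needed to see that this choice splits b can be written out as follows; the only ingredients are the equation supplied by 3.1, the fact that a degree lies strictly below its jump, and the hypothesis that b lies strictly above the first jump of 0.

```latex
\begin{align*}
b_1 \cup b_2 &= c \cup 0' = b, \\
b_1 &= c < c' = b, \\
b_2 &= 0' < b .
\end{align*}
```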

The next two theorems strengthen 2.4 and, like 2.4, are derived from 
theorems about the jump operator. Say that b is cuppable if for all nonzero
$b_1 < b$ there exists $b_2 < b$ such that $b_1 \cup b_2 = b$.


2.5. THEOREM. $\exists a\, \forall b\, (b > a \rightarrow b \text{ is cuppable})$.


2.6. THEOREM. $\exists a\, \forall b$ ($b > a \rightarrow$ b is the least upper bound of two minimal
degrees).


For 2.5 we take $a = 0'$ and use 3.2 and a relativized version of Robinson’s
theorem 5.3. For 2.6 we take $a = 0''$ (cf. Theorem 2.4 of MARTIN and
MILLER [1968]).

We end this section by giving a complete answer to a question which 
seems to have first been raised by Shoenfield (cf. SHOENFIELD [1965]). The
question is: how complicated is the first-order theory of the upper
semilattice $(D, \cup)$?


2.7. THEOREM (SIMPSON [1977]). The first-order theory of $(D, \cup)$ is
recursively isomorphic to the truth set of second-order arithmetic.


Here is a sketch of the proof. By 3.4 we have the same result for the 
expanded structure $(D, \cup, j)$ where j is the jump operator. In the proof of
3.4, the jump operator is used only to show that there exist certain
configurations of degrees which encode second-order arithmetic. By using
2.3 we can speak about these configurations in the first-order theory of
$(D, \cup)$.


3. The jump operator 


In the previous section, the jump operator $j : D \to D$ occurred as an
auxiliary notion, used only in order to prove theorems which did not 
mention it. We now study the jump operator for its own sake. The basic 
result here is Friedberg’s inversion theorem: 


3.1. THEOREM (FRIEDBERG [1957b]). For every degree $b \ge 0'$ there exists a
such that $a' = a \cup 0' = b$.


An easy corollary of 3.1 and 1.2 is that the range of the jump operator is
precisely the set of degrees $\ge 0'$.

There is an extensive literature on the jump operator. Much of this 
literature concerns refinements of 3.1 in various directions. For instance, a 
simple modification of the proof of 3.1 yields the following useful result 
(see also Spector [1956]): 


3.2. THEOREM. For any $b \ge 0'$ there exists an infinite set $\{a_i \mid i \in \omega\}$ such
that $a_i \cap a_j = 0$ for all $i \ne j$, and $a_i' = a_i \cup 0' = b$ for all i.


Other refinements of 3.1 involve restricting the domain of the jump
operator. Thus, in Section 6 of Sacks [1963], it is shown that 


$\{a' \mid 0 < a < 0' \text{ and } a \text{ is r.e.}\}$
$= \{b \mid b \ge 0' \text{ and } b \text{ is r.e. in } 0'\}$.


This beautiful result is also important historically because its proof 
contained the first major application of the so-called “infinite injury”
priority method. Yet another striking refinement of 3.1, whose proof 
incidentally also requires a kind of infinite injury argument, is the 
following: 


3.3. THEOREM (COOPER [1973]). For any $b \ge 0'$ there exists a minimal
degree m such that $m' = m \cup 0' = b$.


We now discuss some recent results concerning first-order definability 
and automorphisms of the structure 


$\mathcal{D} = (D, \cup, j)$.


By second-order arithmetic we mean the first-order theory of the
two-sorted structure

$S = (2^{\omega}, \omega, +, \cdot, E)$

where $+$ and $\cdot$ are the usual arithmetical operations on $\omega$, and
$E : 2^{\omega} \times \omega \to \omega$ is defined by $E(f, n) = f(n)$. Good introductions to the recursion
theoretic literature on second-order arithmetic are Section 16.2 of ROGERS
[1967a] and Section 8.5 of SHOENFIELD [1967]. We are going to discuss a
certain translation of the language of second-order arithmetic into the
language of $\mathcal{D}$. To this end we need a special mapping $\Gamma : D \to 2^{\omega}$ defined by


$\Gamma(d)(n) = \begin{cases} 1 & \text{if } 0^{(n+1)} \le d \cup 0^{(n)}, \\ 0 & \text{otherwise.} \end{cases}$


3.4. THEOREM (SIMPSON [1977]). Let $\varphi(f_1, \ldots, f_i, n_1, \ldots, n_j)$ be a formula of
second-order arithmetic. Then we can effectively find a formula
$\varphi^*(x_1, \ldots, x_i, y_1, \ldots, y_j)$ of degree theory, such that for all $d_1, \ldots, d_i \in D$ and
$n_1, \ldots, n_j \in \omega$,


$S \models \varphi[\Gamma(d_1), \ldots, \Gamma(d_i), n_1, \ldots, n_j]$
if and only if
$\mathcal{D} \models \varphi^*[d_1, \ldots, d_i, 0^{(n_1)}, \ldots, 0^{(n_j)}]$.


PROOF (sketch). Formulas of first-order arithmetic are handled by showing
that the set $\{0^{(n)} \mid n \in \omega\}$ and the relations $\{(0^{(m)}, 0^{(n)}, 0^{(p)}) \mid m + n = p\}$
and $\{(0^{(m)}, 0^{(n)}, 0^{(p)}) \mid m \cdot n = p\}$ of “addition” and “multiplication” on
this set are first-order definable in $\mathcal{D}$. Function quantifiers are handled by
showing that $\Gamma$ maps D onto $2^{\omega}$.


In order to draw some corollaries about automorphisms and definability 
in $\mathcal{D}$, we need the following rather easy proposition. Recall that the degree
$0^{(\omega)}$ is a canonical upper bound to $\{0^{(n)} \mid n \in \omega\}$.


3.5. PROPOSITION (SIMPSON [1977]). The relation $\{(a, b) \mid 0^{(\omega)} \le b = \deg(\Gamma(a))\}$
is first-order definable in $\mathcal{D}$.


3.6. COROLLARY (Solovay). Every degree $\ge 0^{(\omega)}$ is fixed by all
automorphisms of $\mathcal{D}$.


3.7. COROLLARY (SIMPSON [1977]). An n-ary relation $R \subseteq \{d \mid 0^{(\omega)} \le d\}^n$ is
definable in $\mathcal{D}$ if and only if


$R^* = \{(f_1, \ldots, f_n) \mid R(\deg(f_1), \ldots, \deg(f_n))\}$


is definable in second-order arithmetic.


3.8. COROLLARY (SIMPSON [1977]). The substructure of $\mathcal{D}$ whose universe is
$\{d \mid 0^{(\omega)} \le d\}$ is not elementarily equivalent to $\mathcal{D}$.


Corollaries 3.6, 3.7 and 3.8 are fairly immediate consequences of 3.4 and 
3.5. A different proof of 3.6 was discovered earlier by R.M. Solovay using 
observations in Section 5 of Yates [1972]. Jockusch has noted that these 
same observations show that in 3.6, 0“ can be improved to 0. It remains 
open whether $\mathcal{D}$ has any automorphisms other than the identity.


4. V=L versus PD 


In this section we discuss some aspects of the correlation between degree 
theory and axiomatic set theory. We begin with some historical remarks. 

In the earliest days of forcing, many degree theorists noticed the strong 
resemblance between Cohen’s method and the well-developed technique 
of finite approximation in degree theory. Somewhat later, Sacks used 
analogies with degree theory to suggest new results and problems in 
axiomatic set theory (“degrees of nonconstructibility”, Sacks [1971]).
However, direct connections between degrees and axiomatic set theory 
have come to light only very recently. To my knowledge, the literature 
prior to 1968 contains no hint of direct connections (although Sacks [1963] 
mentions some applications of measure theory and descriptive set theory). 

The earliest papers in which concepts of axiomatic set theory are applied 
directly to degrees are BOOLOS and PUTNAM [1968] and MARTIN [1968].
Subsequently, the correlation was found to be unexpectedly close and 
detailed. For instance, it follows from 3.7 that there is a first-order formula
$\psi(x)$ of degree theory, such that for all degrees d, $\mathcal{D} \models \psi[d]$ if and only if d
is constructible in the sense of GÖDEL [1939].

In our opinion, axiomatic set theory offers one of the most interesting 
problem areas for future research on degrees. (In fairness we point out that 
many of our fellow degree-theorists do not share this opinion.) We wish to 
present here some of the known results showing that various set theoretical 
hypotheses, such as V = L (the “axiom” of constructibility) or PD (the
“axiom” of projective determinacy), have striking consequences for the
structure of the degrees. 

A set of degrees $X \subseteq D$ is said to be determined if there exists a degree a
such that either $\{d \mid d \ge a\} \subseteq X$ or $\{d \mid d \ge a\} \cap X = \emptyset$. Not every set of
degrees is determined. However, there are many nontrivial theorems in the 
literature, each saying that some specific set of degrees is determined, one 
way or the other. For example, each of 2.4, 2.5, 2.6, 3.1 and 3.3 expresses an 
instance of determinacy. 

Not only are many familiar subsets of D known to be determined, but 
the reader (even the expert degree theorist) will probably be unable to 
define in ZFC a set of degrees such that he can prove in ZFC that the set is 
not determined. These empirical phenomena lead to the formulation of the 
following heuristic principle: 


4.1. HEURISTIC PRINCIPLE. Let X be a “simple-minded” subset of D. Then
X is determined. 


There is a close connection between determinacy and Gale-Stewart
games. For general information about such games, the reader may consult
Chapter C.8. The specific connection is as follows. Let X be a subset of D.
Let two players, I and II, play an infinite game with perfect information,
leading to the definition of a function $f \in 2^{\omega}$. The rules of this game are that
I picks f(0), then II picks f(1), ..., then I picks f(2n), then II picks
f(2n+1), .... Finally I wins if $\deg(f) \in X$, II wins otherwise. It can be
shown that X is determined if and only if one of the players has a winning
strategy. (Furthermore, the degree a mentioned in the definition of 
determinacy can be taken to be the degree of a winning strategy.) This 
remark is due to MARTIN [1968]. 
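
The order of moves in this game can be illustrated by a short sketch. It assumes that a strategy is modelled as a function from the finite sequence of moves so far to 0 or 1; the names `play`, `always_zero` and `copy_last` are hypothetical. Note that the winning condition depends on the degree of the whole infinite function f and cannot be read off from any finite prefix, so the sketch only produces prefixes and says nothing about who wins.

```python
def play(strategy_one, strategy_two, rounds):
    """Return the first 2*rounds values of f produced by the two strategies.

    Player I picks f(0), f(2), ...; player II picks f(1), f(3), ....
    A strategy maps the tuple of moves made so far to 0 or 1.
    """
    f = []
    for _ in range(rounds):
        f.append(strategy_one(tuple(f)))   # I picks f(2n)
        f.append(strategy_two(tuple(f)))   # II picks f(2n+1)
    return f


def always_zero(history):
    """Hypothetical strategy: ignore the history and play 0."""
    return 0


def copy_last(history):
    """Hypothetical strategy: repeat the opponent's most recent move."""
    return history[-1] if history else 0
```

For instance, `play(always_zero, copy_last, 3)` yields the prefix `[0, 0, 0, 0, 0, 0]`.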

Many true instances of Principle 4.1 can be verified by using the remark 
of the previous paragraph, plus recent results on Gale-Stewart games. For 
$X \subseteq D$ let us write $X^* = \{f \mid \deg(f) \in X\}$. Then we have:


4.2. THEOREM (MARTIN [1975, 1970, 1968]). A set $X \subseteq D$ is determined
under any of the following circumstances:
(i) $X^*$ is Borel (i.e. $\Delta^1_1$).
(ii) $X^*$ is analytic (i.e. $\Sigma^1_1$) and a Ramsey cardinal exists.
(iii) $X^*$ is projective (i.e. $\Sigma^1_n$ for some $n \in \omega$) and PD holds.


From axiomatic set theory, we know that it is impossible to prove the 
consistency with ZFC of the existence of a Ramsey cardinal or of PD. 
However, the inconsistency with ZFC of these hypotheses has not been 
proved. 

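In order to draw out an interesting consequence of 4.2(i), let $\{\varphi_n\}_{n \in \omega}$ be
an enumeration of the sentences of the first-order language of upper
semilattices. Since each principal ideal $I(b) = \{d \mid d \le b\}$ is an upper
semilattice, we may define the sets $X_n = \{b \in D \mid I(b) \models \varphi_n\}$. By 4.2(i), $X_n$
is determined. Let


$\psi_n = \begin{cases} \varphi_n & \text{if } \exists a\, \{b \mid b \ge a\} \subseteq X_n; \\ \neg\varphi_n & \text{if } \exists a\, \{b \mid b \ge a\} \cap X_n = \emptyset. \end{cases}$


Then $\{\psi_n\}_{n \in \omega}$ is the theory of a “typical” principal ideal. I.e., there exists a
degree a such that $I(b) \models \psi_n$ for all $b \ge a$, $n \in \omega$.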

Let us say that a set $X \subseteq D$ is (weakly) definable if it is first-order
definable in the structure $\mathcal{D} = (D, \cup, j)$ (allowing parameters from D). By
4.1 it is natural to ask whether every weakly definable set is determined. If 
PD holds, then this is the case by 4.2(iii). (In fact, 3.7 plus a recent result of 
Harrington make it appear probable that PD is equivalent to the assertion 
that every weakly definable subset of D is determined.) However, the 
negative answer is known to hold in a wide class of models of ZFC. 
Specifically, let $Y = \{d \in D \mid \text{the hyperdegree of } d \text{ is a minimal cover}\}$.
Then by 3.7 or JOCKUSCH and SIMPSON [1976] Y is definable, but by SIMPSON
[1975] Y is not determined if V = L or V is a generic extension of L.

We have just pointed out that some pleasing results about $\mathcal{D}$ are obtained
by assuming PD. We now wish to point out that the picture of $\mathcal{D}$ which we
get by making the diametrically opposite assumption, V = L, is also very
pretty. 

Without any special assumptions, it can be shown that there exists a 
transfinite increasing sequence of degrees $0^{(\alpha)}$, $\alpha < \aleph_1^L$. (See also BOOLOS
and PUTNAM [1968] and JENSEN [1972].) The degrees in this sequence are
naturally identified as transfinite iterates of the jump operator, i.e. we have:

(i) $0^{(0)} = 0$;

(ii) $0^{(\alpha+1)} = \mathrm{jump}(0^{(\alpha)})$ for all $\alpha$;

(iii) $0^{(\alpha)} < 0^{(\beta)}$ for $\alpha < \beta$;

(iv) for each limit ordinal $\lambda < \aleph_1^L$, $0^{(\lambda)}$ is the “least natural” upper bound
for $\{0^{(\alpha)} \mid \alpha < \lambda\}$ in D.

Moreover, if V = L, then the degrees $0^{(\alpha)}$, $\alpha < \aleph_1^L$, are exhaustive in the
sense that the ideal which they generate is all of D. In particular, these 
degrees form a natural example of an undetermined set. 

In order to explain clause (iv) above, recall that by 2.3 the degrees $0^{(\alpha)}$,
$\alpha < \lambda$, can never have a least upper bound. What (iv) says is that $0^{(\lambda)}$ is a
particular upper bound which arises naturally in terms of the algebraic
structure of $\mathcal{D} = (D, \cup, j)$. Let us write $I_\lambda$ for the countable ideal
generated by $\{0^{(\alpha)} \mid \alpha < \lambda\}$. We now describe (in a special case) the exact
nature of the degree theoretic dependence of $0^{(\lambda)}$ on $I_\lambda$. (See also BOYD,
HENSEL and PUTNAM [1969] and JOCKUSCH and SIMPSON [1976].) If I is any
countable ideal in D, a degree d is said to be n-exact over I (where n is a
positive integer) if d is the unique least degree of the form $(a \cup b)^{(n)}$ where
a, b is an exact pair for I. Let $\beta_0$ be the ordinal of ramified analysis, which
can also be defined as the least ordinal $\beta$ such that $L_\beta$ is a model of ZFC
minus the power set axiom. It is well known that $\beta_0$ is a rather large, but
countable, ordinal number.


4.3. THEOREM. For any limit ordinal $\lambda < \beta_0$ we can find a least positive
integer $n = n_\lambda$ such that there exists a degree which is n-exact over $I_\lambda$.
Moreover $0^{(\lambda)}$ is this unique degree.


As $\lambda$ ranges over the limit ordinals less than $\beta_0$, it can be shown that $n_\lambda$
ranges over all the integers $n \ge 2$. Furthermore, if $\lambda$ is $\beta_0$ itself, then $n_\lambda$ is
undefined. It is possible to extend 4.3 beyond $\beta_0$ by considering a notion of
$\nu$-exactness where $\nu$ is an ordinal. In order to continue beyond $\aleph_1^L$, it is
necessary to supplement Jensen’s “fine structure of L” (JENSEN [1972]) with
Solovay’s “fine structure of L(w)” (unpublished).


5. Between 0 and 0’


The results which were mentioned in previous sections belong to what 
might be called “global degree theory”, i.e. they deal with general,
structural properties of $(D, \cup)$ and $(D, \cup, j)$. We now turn to “local
degree theory”, i.e. the theory of the degrees $\le 0'$.

A subset of $\omega$ is said to be recursively enumerable if it is empty or the
range of a recursive function from $\omega$ into $\omega$. A degree is said to be r.e. if it is
the degree of a recursively enumerable subset of $\omega$. It is easy to see that 0
and $0'$ are r.e. degrees, and that every r.e. degree is $\le 0'$. R.e. degrees were
introduced by Post [1944] who posed the following famous problem:
prove that there exists an r.e. degree strictly between 0 and $0'$. Post himself made
some progress, but the problem remained unsolved until FRIEDBERG
[1957a] and MUCHNIK [1956]. Ideas flowing from Post’s problem and its
solution have played a dominant role in local degree theory up to the 
present day. 

The methods of proof in local degree theory are very sophisticated. For 
each of the theorems mentioned in this section, the general pattern of 
proof is as follows. A recursive construction in infinitely many stages is 
performed. There are infinitely many requirements to be satisfied. From 
time to time in the course of the construction, some of these requirements 
are seen to come into conflict with one another. The conflicts are resolved 
when they arise, according to a recursive scheme of priorities. Since 
requirements can be injured at various stages, a delicate argument is 
needed to show that each requirement is finally met. 

A proof in the pattern just described is called a priority argument. The 
first such argument was used in the Friedberg-Muchnik solution of Post’s
problem. Extensions of the Friedberg-Muchnik method were developed 
for other problems by later workers, among them Lachlan, Sacks, and 
Yates. In keeping with our general policy of concentrating on results rather 
than methods, we shall say no more about priority arguments here. The 
interested reader may consult the expository masterpieces of LACHLAN 
[1973], SHOENFIELD [1971], and SoarE [1976]. 

An ultimate solution for Post’s problem would be a complete
determination of the structure of $\mathcal{R}$, the countable partially ordered set of r.e.
degrees. No such solution is in sight, not even a viable conjecture. The few
known results are of a fragmentary nature. A landmark is the density
theorem of Sacks [1964]: if $a < b$ are r.e. degrees, then there exists an r.e.
degree c such that $a < c < b$.

Other results about $\mathcal{R}$ concern its lattice structure. Trivially, $\mathcal{R}$ is an
upper semilattice. Every countable distributive lattice is
lattice-embeddable in $\mathcal{R}$ (THOMASON [1971], LACHLAN [1972]). $\mathcal{R}$ itself is not a
lattice (LACHLAN [1966], YATES [1966]). Every nonzero r.e. degree is the
least upper bound of two smaller r.e. degrees (Sacks [1963]). Not every r.e.
degree $< 0'$ is the greatest lower bound of two larger r.e. degrees (LACHLAN
[1966]). There exists a pair of nonzero r.e. degrees with greatest lower
bound 0 (LACHLAN [1966], YATES [1966]). There does not exist a pair of
nonzero r.e. degrees with greatest lower bound 0 and least upper bound $0'$
(LACHLAN [1966]). There are a few more results in this vein, due mostly to
Lachlan, Yates, and R.W. Robinson. A good survey of the known results is
contained in COOPER [1974]. It is not clear where these results are leading.

Nothing is known about automorphisms of $\mathcal{R}$ or about the complexity of
the first-order theory of $\mathcal{R}$. Sacks [1963] conjectures that the first-order
theory of $\mathcal{R}$ is decidable.

People have often thrashed about looking for meaningful subclasses of 
the r.e. degrees. The most useful discovery so far has been the following 
classification according to finite jumps. For $n \in \omega$ we define


$H_n = \{d \mid d \le 0' \text{ and } d^{(n)} = 0^{(n+1)}\}$,
$L_n = \{d \mid d \le 0' \text{ and } d^{(n)} = 0^{(n)}\}$.


Degrees in $H_1$ ($L_1$) are sometimes called high (low). The hierarchy
$\{H_n, L_n\}_{n \in \omega}$ is known to be proper but not exhaustive: $H_{n+1} - H_n$ and
$L_{n+1} - L_n$ each contain r.e. degrees, but there exist r.e. degrees which are
not in any $H_n$ or $L_n$ (Sacks [1967]).

Several striking applications of the $\{H_n, L_n\}$ hierarchy are known. We
mention here only one type of application. A recursively enumerable set
$A \subseteq \omega$ is said to be maximal if $\omega - A$ is infinite but there is no recursively
enumerable set $B \supseteq A$ such that both $\omega - B$ and $B - A$ are infinite.


5.1. THEOREM (MARTIN [1966]). An r.e. degree contains a maximal set if
and only if it is in $H_1$.


5.2. THEOREM (LACHLAN [1968b], SHOENFIELD [1976]). An r.e. degree
contains a coinfinite recursively enumerable set with no maximal superset, if and
only if it is not in $L_2$.


Theorems 5.1 and 5.2 bear on the general problem of relating the degree 
of a recursively enumerable set to its position in the lattice $\mathcal{E}$ of recursively
enumerable sets ordered by inclusion. This problem goes back to Post 
[1944]. Further results on this problem are in SoarE [1974, 1975] where in 
particular it is shown that H,, L, are not invariant under automorphisms
of @. 

In recent years, interest has grown in the degrees $\le 0'$ other than r.e.
degrees. One of the basic results here is due to Sacks [1963]: there exists a 
minimal degree less than 0’. (By the density theorem for r.e. degrees, no 
minimal degree can be r.e.) Beautiful refinements of Sacks’ theorem have 
been obtained by Cooper, Yates, R. Epstein, and L. Sasso. For a recent 
survey of this developing area, see YATES [1974] and the annotated 
bibliography of Cooper [1974]. 

Yates [1970, 1972] has announced that every finite distributive lattice is 
order-isomorphic to a principal ideal $I(a) = \{d \mid d \le a\}$ such that $a < 0'$.
This result implies that the first-order theory of the upper semilattice I(0’) 
is undecidable (see also LACHLAN [1968a], THOMASON [1970]). We conjec- 
ture that the first-order theory of I(0’) is recursively isomorphic to the truth 
set of first-order arithmetic (cf. 2.7). 

An apparently difficult open problem is whether for every degree $a \le 0'$
there exists a relative complement, i.e. a degree c such that $a \cup c = 0'$ and
$a \cap c = 0$. A beautiful result in this direction is:


5.3. THEOREM (ROBINSON [1972]). If a and b are nonzero degrees $\le 0'$, then
there exists a degree c such that $a \cup c = 0'$ and not $b \le c$.


6. Degrees of complete theories 


A set $C \subseteq 2^{\omega}$ is said to be co-recursively enumerable (co-r.e.) if $2^{\omega} - C$ is
the domain of a partial recursive functional from $2^{\omega}$ into $\omega$ (see Chapter
C.1). The basis problem is the problem of effectively choosing an element
from a nonempty, co-r.e. subset of $2^{\omega}$. In the present section, a number of
results on the basis problem will be presented. 

The reason for the great interest in the basis problem is that co-r.e. sets 
often arise naturally in the practice of mathematics. We focus here on a 
particular class of examples taken from mathematical logic. Let T be a 
finitely axiomatizable theory in a first-order language of finite similarity 
type. Assume a fixed Gödel numbering of the sentences of the language of
T. A complete theory in this language will be identified with the
characteristic function of the set of Gödel numbers of its theorems. Thus the set
of all complete extensions of T is identified with a set $C_T \subseteq 2^{\omega}$. As an
abstract topological space, $C_T$ is just the Stone space of the Boolean
algebra of sentences of the language of T modulo provable equivalence in
T. As a subset of $2^{\omega}$, $C_T$ is easily seen to be co-r.e.

We digress to quote the following theorem which shows that the class of
examples just discussed is adequate, i.e. comprises essentially all co-r.e.
subsets of $2^{\omega}$. (Thus, when we study the basis problem in terms of degrees,
we are really studying degrees of complete extensions of finitely
axiomatizable theories.) Two sets $C_1, C_2 \subseteq 2^{\omega}$ are said to be recursively
homeomorphic if there exists a partial recursive functional from $2^{\omega}$ into $2^{\omega}$ which
maps $C_1$ one-one onto $C_2$.


6.1. THEOREM (HANF [1975]). Let C be any co-r.e. subset of $2^{\omega}$. Then there
exists a finitely axiomatizable theory T in the first-order language with 
equality and one binary relation symbol, such that C is recursively 
homeomorphic to the set of complete extensions of T. 


In JOCKUSCH and SOARE [1972], priority arguments are used to prove the
existence of co-r.e. sets $C \subseteq 2^{\omega}$ with various pathological properties. We
can then apply Theorem 6.1 to deduce corollaries about the existence of
finitely axiomatizable theories T with corresponding pathological
properties. For more information on this topic see HANF [1965] and MARTIN and
POUR-EL [1970].

Our results concerning the basis problem will be stated concisely in 
terms of a relation $\ll$ on degrees which we now define. A set $C \subseteq 2^{\omega}$ is
said to be co-r.e. in a degree a if $2^{\omega} - C$ is the domain of a partial
functional from $2^{\omega}$ into $\omega$ which is recursive in a. We put $a \ll b$ if every
nonempty subset of $2^{\omega}$ which is co-r.e. in a contains a function of degree
$\le b$. The next theorem is a collection of simple facts about $\ll$ whose
origins are difficult to trace. 


6.2. THEOREM. (i) $a \ll b$ implies $a < b$.
(ii) $a \le b \ll c \le d$ implies $a \ll d$.
(iii) $a \ll b$, $b \ll c$ implies $a \ll c$.
(iv) $a \ll a'$ for all a.


The next theorem explains the choice of notation $\ll$.


6.3. THEOREM (JOCKUSCH and SOARE [1972]). If $a \ll b$, then every
countable partially ordered set is embeddable in $\{d \mid a < d < b\}$.


The next theorem says that there is no positive correlation between $\ll$
and relative recursive enumerability, except as noted in 6.2(iv).


6.4. THEOREM (JOCKUSCH and SOARE [1972]). (i) If $a \ll b$ and b is r.e. in a,
then $a' = b$.
(ii) For all a there exists b such that $a \ll b$ and $a' = b'$.


The next theorem expresses two striking structural properties of $\ll$.


6.5. THEOREM. (i) If $a \ll b$, then there exists c such that $a \ll c$, $c \ll b$.
(ii) For all a and $b \ge a$, there exists $c \gg a$ such that $a = b \cap c$.


We finish with some results relating $\ll$ to more familiar concepts. For
convenience we state the results in unrelativized form. The next theorem is
derived from SCOTT [1962], JOCKUSCH and SOARE [1972], and an
unpublished result of R.M. Solovay. (See also ROGERS [1967a], p. 94.)


6.6. THEOREM. For any degree b, the following assertions are pairwise 
equivalent: 
(i) $b \gg 0$;
(ii) b is the degree of a complete extension of Peano arithmetic (or of 
ZFC, assuming ZFC is consistent); 
(iii) b is the degree of a set which separates an effectively inseparable, 
disjoint pair of recursively enumerable sets. 


6.7. THEOREM (JOCKUSCH [1972]). For any degree b, the following assertions
are equivalent:

(i) either $b \gg 0$ or $b' \ge 0''$;

(ii) b is the degree of a sequence of functions containing all the recursive
elements of $2^{\omega}$.


A degree b is said to be almost recursive if for every function $p : \omega \to \omega$
which is recursive in b, there exists a recursive function $q : \omega \to \omega$ such that
$p(n) < q(n)$ for all $n \in \omega$. (See also MARTIN and MILLER [1968].)


6.8. THEOREM (JOCKUSCH and SOARE [1972]). There exists a degree b such
that $b \gg 0$ and b is almost recursive.


Introduction


Our goal in this chapter is to give the non-specialist a rough picture of 
the current state of $\alpha$-recursion theory (i.e., recursion theory on admissible
ordinals). We hope first (Section 1) to explain the basic ideas underlying the 
subject and then (Section 2) to present the important questions and main 
areas of research currently being pursued. We will also report on the 
progress being made and on the techniques being developed and employed 
in this work. To illustrate these ideas we will (Section 3) give one fairly 
complete proof of a typical theorem in $\alpha$-recursion theory and then
(Section 4) examine the relation and application of these methods to other 
areas of research. Of necessity our treatment of the historical background 
as well as the material itself will be both sketchy and superficial. We have 
tried to include a fairly complete and somewhat annotated bibliography to 
help fill in the gaps. 

Historically the motivation for generalizing recursion theory to infinite 
ordinals came from the several branches of mathematical logic: proof 
theory, model theory and set theory as well as, of course, recursion theory 
itself. Thus, for example, Takeuti [1954, 1957] was interested in the 
problem of reducing the consistency of set theory to that of a theory of 
ordinal numbers. In TAKEUTI [1960] he introduced a recursion theory on
the ordinals to show that the required theory and relative consistency proof 
could be made effective in some generalized sense. Essentially he showed 
that Gödel’s construction of L (the class of constructible sets) could be
mimicked in an (ordinal) effective way to give a (recursively) isomorphic 
copy of L within his theory of ordinal numbers (Takeuti [1965a]). In many 
ways this work foreshadowed the close interconnections between recursion 
theory and set theory that arose in the study of the fine structure of L. We 
will discuss this relationship at some length later on. 

In the model-theoretic vein we cite two examples. MACHOVER [1961]
(who also collaborated with Levy) wanted to generalize model-theoretic 
ideas and results which involved recursion-theoretic concepts to the 
infinitary languages I,.,. He therefore developed a recursion theory on 
regular infinite cardinals to state and prove such theorems as satisfiability is 
not “arithmetically” definable. In a somewhat different vein we have the
work of KREISEL [1961, 1965] (who later collaborated with Sacks in KREISEL
and Sacks [1963, 1965]). The main interest here seems to be definability 
theory and its relation to various higher logics and languages. This work 
developed along both recursion-theoretic lines (as evidenced in the papers 
with Sacks) which we will continue to discuss and more model-theoretic 
ones (although with some proof-theoretic flavorings as well). The latter 
stemmed from Kreisel’s insistence on restrictions of infinitary language and 
associated generalizations of finiteness based on definability considerations 
rather than cardinality. In the hands of Barwise and others it has flowered 
into the extensive subject of admissible structures and their associated 
infinitary languages. As this area lies outside the scope of this chapter we 
will only refer the reader to the bibliography for some starting points in the 
literature and to Chapter A.7. 

The set-theoretic viewpoint is represented by the work of Jensen in 
JENSEN and KARP [1971]. (Karp, however, came to the subject from a 
model-theoretic approach, somewhat like Machover’s.) As this approach is 
intimately connected with the recursion-theoretic one represented by 
KRIPKE [1964a, 1964b] and PLATEK [1966] (as well as the previously 
mentioned work of Kreisel and Sacks), which forms the heart of our 
subject, we will try to follow developments from both points of view. 

In addition to the specific problems from other areas that motivated the 
early work on recursion theory on ordinals, there were also general 
principles involved. The recursion theorists had many of the usual goals of 
generalization in mind. One hopes to build not only new and interesting 
abstract structures but also ones that will illuminate ordinary recursion 
theory and perhaps prove useful in applications as well. A further hope is 
that such work might lead to an axiomatic approach to recursion theory. 
The set-theorist, on the other hand, comes to the subject with the goals of 
effectivization rather than generalization. By introducing notions of effec- 
tiveness he hopes to gain a finer and deeper knowledge of the structure of 
sets and eventually to exploit this extra information to solve already 
existing problems in set theory. 


1. Ideas and definitions 


The question now is where to begin a generalization of recursion theory. 
One must first decide what are the basic objects and notions in ordinary 
recursion theory that one wishes to generalize or abstract. Of course we 
have the natural numbers 0,1,2,3,... as the primary elements of our 
universe, while the basic notions of interest seem to be recursiveness and 
recursive enumerability. One natural generalization for the numbers is 
certainly the ordinals. After all one just keeps on counting past ω. One has, 
however, two choices corresponding, perhaps, to two different views of the 
natural numbers. One can take either all the ordinals (ON) or just some 
initial segment of ordinals up to, say, α as the basic objects. We will 
concentrate on the latter approach, though most of what we say will apply 
equally well to both. As for definitions of recursiveness and recursive 
enumerability, the recursion theorist has several standard alternative 
approaches to consider (though in ordinary recursion theory all of them 
define the same class of functions). Many of these schemes were tried out 
by the early workers in the field. To cite some examples we note that 
TAKEUTI [1960] introduced schemes for generating semi-recursive functions 
and with KINO [1962] put forth an alternate approach that first generated 
primitive recursive functions and then applied a μ (least number) operator. 
Other proposals included analogs of Turing machines (LEVY [1963]), 
definability criteria (KREISEL [1965]) and various types of equation calculi 
(MACHOVER [1961], TUGUÉ [1964] and KRIPKE [1964a]). 

The most useful approach seems to be one based on an equation calculus 
like that in KRIPKE [1964a]. One takes Kleene’s ordinary equation calculus 
and adds on an infinitary rule that expresses the underlying idea of the 
generalization: one is to be allowed up to α many steps in a computation 
and to survey up to α many bits of information so far produced at each of 
these steps (see KRIPKE [1967] or SACKS [1967] for further discussion and 
technical details).¹ One then defines the recursive functions as usual to be 
those whose values can be consistently deduced from some finite set of 
equations via these rules. The recursively enumerable (r.e.) sets are, of 
course, the ones which can be recursively listed. (We will indicate the 
generalized notions by prefixing an α as in "α-recursive" and 
"α-recursively enumerable".) 

One problem immediately comes to mind with this procedure. Not all 
ordinals α will be satisfactory domains for this recursion theory. Thus for 
example ω + 17 is not closed under addition and should surely be 
unacceptable. Whatever procedure we use for defining recursiveness, we want at the 
very least to guarantee that α be closed under recursive functions. 
Thinking specifically in terms of an equation calculus we would also want 
to require that any individual computation, i.e. deduction, of the value of 
some recursive function be completed in fewer than α many steps. Both of 
these requirements are met if we assume that for any given finite set of 


¹ We should remark that some care is needed in the choice of an equation calculus. Kripke 
told us in the name of A. Levy (see Avot 6:6) that the system of MACHOVER [1962] has 
problems if V ≠ L because of the use of infinitely many variables and the supremum operator. 
Indeed, even if one uses only finitely many variables and assumes V = L, the formulations 
using the sup or lub operator of TUGUÉ [1964] and STILLWELL [1970] have all the desired 
properties only for regular cardinals. 
equations all possible consequences can be deduced in fewer than α many 
steps. Such ordinals are called admissible (KRIPKE [1964a]) and constitute 
the proper domains for our recursion theories. (We should note that it was 
not until the work of KRIPKE [1964a] and PLATEK [1966] that the proper 
domains for recursion theory on ordinals were isolated. Most of the other 
workers (e.g. MACHOVER [1961], LEVY [1963] and TUGUÉ [1964]) restricted 
their attention to the obviously suitable case of regular cardinals. That 
other domains were possible was realized in TAKEUTI [1960], where it was 
shown that singular cardinals as well were closed under recursive functions. 
In addition KREISEL and SACKS [1965], approaching matters in a quite 
different manner, worked on ω₁^CK, the least non-recursive ordinal, which is 
also the least admissible after ω.) 

Although it may not be obvious, there is another basic notion that 
requires generalization: finiteness. Early work tended to leave it 
unchanged or (especially in infinitary languages) employ cardinality 
conditions (sets of size < α were α-finite). The need for a finer approach was 
first stressed in KREISEL [1961]. In our present setting it would certainly 
seem reasonable to require that each of our counting numbers (the ordinals 
less than α) be considered α-finite. This, however, would not be enough. 
What would we say of sets which were the same size as some β < α? Surely 
if the correspondence between elements were effective such a set too would be 
α-finite. Is this, however, sufficient? As it turns out it helps to turn to a 
set-theoretical viewpoint to answer this question. Let us therefore consider 
for a moment how a set-theorist might generalize recursion theory into the 
infinite. 

A natural starting point for someone interested in effectivizing set theory 
is Gödel’s constructible universe (see Chapter B.5). It, after all, consists of 
those sets built by the most effective method of ordinary set theory: 
definition. Again we have here two choices for our domain of discourse. 
We can work with all of L or with some initial segment L_α (the sets 
constructed by level α). As before we will concentrate on the latter 
alternative mostly for the sake of notational convenience. Given such a 
domain L_α built up by definitions, it seems reasonable to look at 
complexity of definition as a guide to deciding which subsets of, or 
functions on, L_α are to be considered simple or effective. If we combine 
this idea with the one that a recursively enumerable set is one given by a 
search operation or enumeration, then a good candidate for recursive 
enumerability is being Σ₁ over L_α (i.e. definable by a formula with 
parameters in L_α and a single initial existential quantifier as the only 
quantifier not restricted in its scope to an element of L_α). The idea of 
course is that an element is put in a Σ₁ set when we find (in our search 
through L_α) a witness for the existential quantifier. (Checking the truth of 
such a sentence only involves looking inside a single element (say an L_β for 
β < α) of L_α and so is surely effective.) With a definition of recursive 
enumerability at hand we can define a subset of L_α to be recursive if both it 
and its complement are r.e. (i.e. it is Δ₁). 
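
In symbols, the two definitions just given might be sketched as follows. The letters φ, p, x and y are our own illustrative names: φ is assumed to be a formula all of whose quantifiers are bounded by elements of L_α, and p is a parameter from L_α.

```latex
% Sketch only; \varphi has bounded quantifiers, p is a parameter in L_\alpha.
\begin{align*}
  A \subseteq L_\alpha \text{ is } \Sigma_1 \text{ over } L_\alpha
    &\iff A = \{\, x \in L_\alpha : L_\alpha \models \exists y\, \varphi(x, y, p) \,\}
      \text{ for some such } \varphi \text{ and } p, \\
  A \subseteq L_\alpha \text{ is } \Delta_1 \text{ over } L_\alpha
    &\iff \text{both } A \text{ and } L_\alpha \setminus A
      \text{ are } \Sigma_1 \text{ over } L_\alpha .
\end{align*}
```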

We are again faced with the problem that not all ordinals α will 
correspond to an L_α suitable for our recursion theory. Thinking now as 
set-theorists we might try to remedy this situation by requiring that L_α 
satisfy the axioms of set theory in some effective form. In particular this 
means that in addition to the usual trivial axioms (extensionality, empty 
set, pairing and union) which are satisfied by any L_λ with λ a limit, we 
might ask that the replacement axiom (or equivalently the Aussonderung 
axiom) hold for effective functions (= Δ₁ = Σ₁ with domain an element of 
L_α). We do however give up the power set axiom entirely. As it turns out 
this requirement of Σ₁-replacement is equivalent to the recursion- 
theoretically defined notion of admissibility (see KRIPKE [1964a] or 
FUKUYAMA [1971] for the actual details of the equivalence). Thus we see the 
beginnings of the relationship between the two approaches. 
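
As a rough sketch, the replacement requirement described above can be written out as below. The names f and a are ours, and the statement is only our reading of "replacement for effective functions whose domain is an element of L_α".

```latex
% Sigma_1-replacement over L_\alpha (sketch; f and a are illustrative names).
\[
  f \text{ a function that is } \Sigma_1 \text{ over } L_\alpha ,\quad
  \operatorname{dom}(f) = a \in L_\alpha
  \quad\Longrightarrow\quad
  f[a] = \{\, f(x) : x \in a \,\} \in L_\alpha .
\]
```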

The connection between these two formulations is in fact much more 
thoroughgoing than this one coincidence. Indeed they are essentially the 
same. For admissible α there is an α-recursive relation on α which is 
isomorphic to the ∈ relation on L_α. Moreover, all the basic notions are 
carried over by the translation. Thus a set is α-recursive, or α-r.e., iff its 
image is Δ₁ or Σ₁ over L_α respectively. This equivalence can be used to 
guide us to a good answer to our question about the proper definition for 
α-finite. For the set-theorist the natural generalization of finite to element 
of L_α makes good sense. Upon translating this back to recursion-theoretic 
language we discover that it is equivalent to being the recursive image of an 
ordinal less than α. This equivalence is then good evidence for the 
adequacy of our definition of α-finite. 
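
For a set K of ordinals less than α, the equivalence just described can be summarized in one line. This is a sketch in our own notation: f[β] denotes the image of β under f, and f is assumed to be an α-recursive function defined on all of β.

```latex
% Sketch: K is a subset of alpha; f[\beta] = \{ f(\xi) : \xi < \beta \}.
\[
  K \text{ is } \alpha\text{-finite}
  \iff K \in L_\alpha
  \iff K = f[\beta] \text{ for some } \alpha\text{-recursive } f
       \text{ and some } \beta < \alpha .
\]
```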

We should also point out that when this problem of generalized 
finiteness was first raised by Kreisel it was with special emphasis on the 
analogy between Π¹₁ and r.e. in ordinary recursion theory. The proposal 
(KREISEL [1961]) was to have Δ¹₁ correspond to finite rather than recursive 
to improve the analogy. When KREISEL and SACKS [1963, 1965] formulated 
these ideas in terms of recursion theory on ω₁^CK they accordingly 
generalized finite to recursive and bounded. (Π¹₁ sets indeed become ω₁^CK-r.e. and so 
Δ¹₁ subsets of ω were ω₁^CK-recursive and bounded.) This definition is also 
easily seen to be equivalent to that of α-finite. 


2. Questions and answers 


Now that we have agreed on the primary notions of α-recursion theory 
we must consider what sort of questions should be attacked. After 
establishing the basic facts of ordinary recursion theory, such as the 
enumeration, iteration and recursion theorems, the recursion-theorist is 
naturally drawn to the traditional areas of investigation of ordinary 
recursion theory: relative complexity (that is, degree theory), particularly of 
the r.e. sets, and structural (that is, lattice-theoretic) properties of sets, again 
usually of the r.e. sets. These areas have recently constituted the major 
concerns of α-recursion theory. In addition to establishing the main 
structural theorems (for degrees as well as sets), the goal of much of the 
current work has been to analyze and illuminate the methods used in these 
areas. In particular the various types of priority arguments that are the key 
to most of the deeper results of ordinary recursion theory have been 
extensively studied and adapted to the general setting of α-recursion 
theory. We will just mention some of the results in each area, but we must 
first pause to explain what we mean by α-degrees and α-reducibilities. 

There are a number of different reducibilities of interest in ordinary 
recursion theory. Some of them such as many-one and one-one 
reducibilities have useful straightforward generalizations in α-recursion 
theory: A ≤_mα B (A ≤_1α B) iff there is a (one-one) α-recursive function f 
such that β ∈ A ⟺ f(β) ∈ B. On the other hand the situation for Turing 
reducibility (which is certainly the most important one in ordinary recur- 
sion theory) is somewhat more complicated. 

Perhaps the most common approach now to Turing reducibility is in 
terms of an oracle. One views a computation relative to some function f 
(sets are considered in terms of their characteristic functions) as proceeding 
along as usual with the added proviso that at any point one may obtain 
from the oracle the value of f at a given number n. In terms of the equation 
calculus we might first try to put all of f into the initial set of equations, for 
then at any step we have available the value of f(n) for each n. We would 
then say that g is computable from f if all its values can be consistently 
deduced from a finite set of equations plus the complete graph of f. In 
α-recursion theory this defines the reducibility called α-calculability. (We 
write g ≤_cα f (A ≤_cα B) to mean that g (A) is α-calculable from f (B).) 

Although α-calculability is an important and useful reducibility (for 
countable α it corresponds to KREISEL’s [1965] model-theoretically defined 
notion of implicit invariant definability), we do not consider it the 
appropriate recursion-theoretic one. Its principal conceptual shortcomings 
are that a single computation of a value of g may use α-infinitely much 
information about f and may take more than α many steps to complete. 
Both of these possibilities seriously conflict with our intuitive ideas from 
ordinary recursion theory about how relative computations should pro- 
ceed. 

To see how to overcome these difficulties we turn to another view of 
relative recursiveness in ordinary recursion theory which incorporates 
these intuitions. The idea is that we should be able to specify the 
computation procedure effectively along with the questions it asks of the 
oracle in any given computation. That is, the description of the procedure 
as a whole is independent of the particular answers given along the way. 
One can picture the procedure as being given by an effective tree structure, 
branching at the various questions. The information used on any terminat- 
ing path is then the finite amount needed for that computation. We 
recommend ROGERS [1967] pp. 128-132, who takes this approach as basic 
in ordinary recursion theory, for further discussion. We think of the 
computation procedure as being coded by an r.e. set. This set enumerates 
terms corresponding to the outputs of terminating branches paired with the 
finite set of answers needed to produce those outputs. This view leads us to 
a reasonable formal definition: A is α-recursive in B, written A ≤_α B, iff 
there is an α-r.e. set W_e such that for each α-finite set K we have that 


    K ⊆ A ⟺ ∃⟨K, 1, M, N⟩ ∈ W_e (M ⊆ B & N ⊆ B̄), 
    K ⊆ Ā ⟺ ∃⟨K, 0, M, N⟩ ∈ W_e (M ⊆ B & N ⊆ B̄) 


where M and N range over α-finite sets and we employ some effective 
coding scheme for such sets and quadruples. Note that, in line with our 
generalizing the concept of finiteness, both our inputs (M, N) and outputs 
(K ⊆ A or K ⊆ Ā, where Ā is the complement of A) are required to be 
α-finite. (Historically an earlier 
attempt at a definition only asked for single (or equivalently truly finite) 
outputs. This is now called weak α-recursiveness, written A ≤_wα B. Not only 
does this fail to capture the true spirit of the generalization but it turns out 
(DRISCOLL [1968]) to be technically unworkable as a reducibility: it is not 
transitive.) 
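
For comparison, the two reducibilities can be set side by side in LaTeX. This is a sketch in our own notation: overbars denote complements in α, K, M and N range over α-finite sets, and W_e is an α-r.e. set of coded quadruples. The weak version is only our reading of "single outputs", with an ordinal x < α in place of the α-finite set K.

```latex
% Sketch: K, M, N range over alpha-finite sets; x ranges over ordinals below alpha.
\begin{align*}
  A \le_\alpha B \iff \exists e\, \forall K \bigl[\,
     & \bigl( K \subseteq A \leftrightarrow \exists M, N\,
         ( \langle K, 1, M, N \rangle \in W_e \wedge M \subseteq B
           \wedge N \subseteq \overline{B} ) \bigr) \\
     \wedge\ & \bigl( K \subseteq \overline{A} \leftrightarrow \exists M, N\,
         ( \langle K, 0, M, N \rangle \in W_e \wedge M \subseteq B
           \wedge N \subseteq \overline{B} ) \bigr) \,\bigr], \\
  A \le_{w\alpha} B \iff \exists e\, \forall x \bigl[\,
     & \bigl( x \in A \leftrightarrow \exists M, N\,
         ( \langle x, 1, M, N \rangle \in W_e \wedge M \subseteq B
           \wedge N \subseteq \overline{B} ) \bigr) \\
     \wedge\ & \bigl( x \notin A \leftrightarrow \exists M, N\,
         ( \langle x, 0, M, N \rangle \in W_e \wedge M \subseteq B
           \wedge N \subseteq \overline{B} ) \bigr) \,\bigr].
\end{align*}
```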

Associated with this reducibility (≤_α) we naturally have a notion of 
α-degree (the equivalence classes under ≡_α are generally denoted by bold- 
face letters a, b, c, ...). The degrees are given the structure of an upper- 
semilattice by the ordering induced by ≤_α and the join operator ∨ defined 
as in ordinary recursion theory. As we said before, the investigation of the 
structure of this semilattice has been one of the main areas of research in 
α-recursion theory (mostly by Sacks and his students). The most successful 
work in terms of the methodology of priority arguments as well as actual 
structural results has been on the α-degrees of α-r.e. sets (called the 
α-recursively enumerable degrees). Although much work had previously 
been done for certain ordinals (especially ω₁^CK), the first real success of the 
priority method for all admissibles came when SACKS and SIMPSON [1972] 
established the analog of the Friedberg-Muchnik solution to Post’s 
problem: there are incomparable α-r.e. degrees, that is, degrees a and b 
such that a ≰_α b and b ≰_α a. They achieved more, however, than merely 
solving this problem. Their methods were sufficient to handle all types of 
simple finite injury priority arguments for all admissible ordinals. 

Since then considerable progress has been made with more difficult 
priority arguments. The methods developed in SHORE [1975a] to prove the 
splitting theorem (for every non-α-recursive α-r.e. c there are α-r.e. a and 
b such that a <_α c, b <_α c and a ∨ b = c) seem adequate for priority 
arguments with unbounded preservations (see Section 3 below for details). 
The status of infinite injury arguments in α-recursion theory is somewhat 
less clear at the moment. The prime example from ordinary recursion 
theory has been successfully generalized (the α-r.e. degrees are dense, 
SHORE [1976a]) but not all the theorems investigated have worked out 
so well. Thus for example LERMAN and SACKS [1972] showed that there is a 
minimal pair of α-r.e. degrees (i.e. non-α-recursive α-r.e. a and b such 
that if c ≤_α a and c ≤_α b then c is the degree of the α-recursive sets) for 
most but not all admissible α. (A slight improvement is given in SHORE 
[1975b] but the most difficult case remains open.) 

Along these lines the main open question seems to be to find a first order 
difference between the theory of the r.e. degrees and that of the α-r.e. 
degrees for some α. Indeed this question is open even for the theory of all 
α-degrees. The only answer (to either question) currently available uses 
the jump operator: thus for example if a = Ni and A =, 0’, then A’=, 0’ 
(SHORE [1976b]), while in ordinary recursion theory there are even r.e. sets 
A <_T 0′ with A′ ≡_T 0″ (SACKS [1966a], §7). Of course the real point is to find a 
difference in the theories with only < in the language.¹ We must admit, 
however, that very little is known at all about the structure of the 
α-degrees as a whole. Even the first question one might ask has not been 
completely answered: minimal α-degrees are known to exist for all 
countable α (MACINTYRE [1974]) and all Σ₂ admissibles (SHORE [1972b]) but 
not in general for every α. The prime example of the unknown cases is 
α = ℵ_ω^L. 

¹ (Added in proof.) Such a difference has now been found, see SHORE [1976c]. 

The other major area of current research has been the structure of the 
lattice of α-r.e. sets, ℰ(α) (mainly by Lerman and his students). In addition 
to illuminating the structure of the α-r.e. sets by answering specific natural 
questions about the lattice (e.g. are there maximal elements), the overall 
goal of this work has been to settle the question of the decidability of these 
lattices. Although this is a major open problem even in ordinary recursion 
theory, it is hoped that one will be able to attack it piecemeal for various 
α’s which have especially simple lattice structures. 

To be more precise, as in ordinary recursion theory one’s attention really 
centers on the lattice ℰ*(α) of α-r.e. sets modulo "finite" sets. What 
should be a fruitful analog for finiteness in this setting has itself been a 
matter for some investigation as in LERMAN [1976b, 1976c]. For lattice- 
theoretic purposes the best choice seems to be what Lerman calls α*-finite 
(the set and all its α-r.e. subsets are α-finite). This notion is the only one 
giving an ideal of generalized finite sets which is definable over ℰ(α). 
Moreover the theory of ℰ(α) is equidecidable with that of the quotient 
lattice ℰ*(α) (LERMAN [1976b]). 

In line with the general goal of attacking the decidability of ℰ*(α) for 
various a, work has been directed not only at proving general theorems for 
all admissibles but also at analyzing it in depth for certain promising ones 
such as Ny and N. The best result of the first sort is LERMAN’s [1974a] 
characterization of those admissibles which have maximal elements in 
ℰ*(α). Essentially he shows that this happens if and only if α is countable 
via a function with a certain type of recursive approximation. (The 
necessity of some countability assumption was first shown in LERMAN and 
SIMPSON [1973].) For examples of the second type of result we here cite 
work by CHONG and LERMAN [1975] on hyperhypersimple sets (ones whose 
supersets in ℰ*(α) form a Boolean algebra) and of LERMAN [1976a] and 
LEGGETT and SHORE [1976] on the possible 1-types of simple sets (ones 
whose complement, although α*-infinite, contains no α*-infinite α-r.e. 
subset). Needless to say much work remains to be done in this area.¹ 


3. An example 


Work in α-recursion theory has shed some light on the general nature of 
priority arguments and generated applications to both recursion in higher 
type objects (HARRINGTON [1973]) and axiomatic recursion theory (SIMPSON 
[1974b]). It has also interacted with set-theoretic work in a strong way. 
Perhaps the best way to explain both these relationships and the workings 


¹ (Added in proof.) See LERMAN [1977] for some major progress. 
of α-recursion theory itself is by an example. We will present a slightly 
modified proof of the Splitting Theorem from SHORE [1975a]. We first 
sketch the proof of this theorem in ordinary recursion theory (see SACKS 
[1966a], §5, or SHOENFIELD [1971], §14). 


THEOREM. Let C be an r.e. member of a given non-recursive r.e. degree c 
enumerated by c. We wish to construct r.e. sets A and B such that 
A ∪ B = C, A ∩ B = ∅, A ≤_T C, B ≤_T C, C ≰_T A and C ≰_T B. 


(Notice that A ∪ B = C implies that A ∨ B ≡_T C and so we have split the 
degree of C as required.) The basic plan is, at each stage σ of the 
construction, to put c(σ) into precisely one of A and B (the positive 
requirements). This immediately insures that A ∪ B = C, A ∩ B = ∅, 
A ≤_T C and B ≤_T C (to see if x ∈ A, say, ask if x ∈ C and if so wait until it 
is enumerated and check whether or not it is put into A). Thus we only 
have to take steps to make sure that we cannot compute C from A or B. 
Our (somewhat roundabout) approach here (due to Sacks) is that for each e 
we attempt (with priority e) to preserve computations of {e}^A (and 
similarly {e}^B) on initial segments as long as they seem to agree with C (i.e. 
its characteristic function). The idea is that once e has highest priority (as 
explained below) we preserve the first available computation of {e}^A(x) for 
each x as long as we have no disagreement with C. If {e}^A = C we would 
thus preserve such computations for every x. Of course {e}^A would then be 
recursive. (Just wait until we preserve a computation {e}^A(x). As e has 
highest priority it will never be destroyed and so will give the correct value 
of {e}^A.) We then would contradict the non-recursiveness of C. 

To be precise, the construction and the simultaneous definition of 
terminology proceed as follows. We let A^σ and C^σ denote the elements 
enumerated in A and C before stage σ respectively. We use the notation 
{e}_σ^{A^σ}(x) = C^σ(x) to mean that there is a computation (the appropriate 
4-tuple) less than σ which shows that computing via procedure e relative to 
A^σ at x gives the value of the characteristic function of C^σ at x. At stage σ 
we find, for each active e < σ, the least computation of 
{e}_σ^{A^σ}(x) = C^σ(x) (if one 
exists) and create a negative e-requirement for A with argument x consisting 
of the elements assumed to be outside of A in this computation (where x is 
the least number y for which we now have no negative e-requirement with 
argument y). If at any later stage we put an element of this requirement 
into A we say that we have destroyed it. We say that a reduction procedure 
e < σ is active at stage σ unless we have a negative e-requirement (as 
yet undestroyed) with argument y such that {e}_σ^{A^σ}(y) ≠ C^σ(y) (note that 
this can only happen if y has been enumerated in C since the requirement 
was created) in which case it is inactive. Of course we do all of this for B as 
well. 

Finally we put c(σ) into A or B so as to preserve as much as possible. 
That is, we let e_A (e_B) be the least number such that c(σ) belongs to a 
negative e-requirement for A (B) (say —1 if there is no such number). If 
e_A ≤ e_B we put c(σ) into B; otherwise it goes into A. Thus we have given e 
requirements for A priority over e′ ones for B iff e ≤ e′. 

To see that the construction succeeds (i.e. C ≰_T A and C ≰_T B) one first 
argues by induction on e that only finitely many negative e-requirements 
are ever created. 

Suppose, for the sake of induction, that by stage σ₀ we have created all 
negative e′-requirements for e′ < e that will ever be created. Together they 
all form a finite set, say N. By some stage σ₁ ≥ σ₀ all elements of C ∩ N 
will have been enumerated in C. The priority ordering now guarantees that 
any e-requirement for A created after stage σ₁ is never destroyed. (We call 
such requirements permanent and others temporary.) Note first that if at 
any stage σ > σ₁ the procedure e is inactive, it is inactive forever after, as the 
associated requirement is permanent. Thus, by the rules of the construction, no 
further e-requirements for A can ever be created. If, however, e is never 
inactive at a stage σ > σ₁, then by definition every e-requirement for A is 
associated with a computation {e}^A(x) giving the correct value of C(x). As 
such requirements are created for initial segments of x’s there can be 
infinitely many of them only if we create one for each x. Were this the case, 
however, we could compute C(x) by just looking at the value of the 
first e-requirement for A with argument x created after stage σ₁. As this 
would contradict the non-recursiveness of C, we conclude that there are 
only finitely many e-requirements ever created for A. We then argue 
similarly for B. 

It is now fairly easy to see that C ≰_T A. Suppose that {e}^A = C and let σ₁ 
be as in the above argument. Now e can never be inactive at any stage σ > σ₁ for 
then {e}_σ^{A^σ}(x) ≠ C^σ(x) (with x ∈ C in fact) and the permanence of the 
requirement implies that this computation from A is correct, contradicting 
{e}^A = C. Thus nothing prevents us from creating e-requirements for A 
with argument x for each x when the correct computation of {e}^A(x) = 
C(x) actually turns up. Of course, this contradicts the above result that there 
are only finitely many of these requirements and so C ≰_T A. Similarly, 
C ≰_T B by the same argument. 

Let us now try to follow the same procedure in α-recursion theory. The 
only difficulties appear in the inductive proof that there are only α-finitely 
many e-requirements for each e. The major difficulty is one endemic in 
α-recursion theory. Even if we have proven this for each e′ < e we do not 
know that the set N of all such requirements is α-finite. We cannot 
therefore assume the existence of a stage σ₀ as required. The point is that a 
union of fewer than α many α-finite sets need not be α-finite. For example 
if α = ℵ_ω nothing prevents the formation of e-requirements for every 
x < ℵ_e for each e < ω. Thus we would have α many e′-requirements for 
e′ < ω and no stage like σ₀ would exist. (Note that if N is α-finite σ₀ does 
exist since the map, taking an element of N to the stage at which it is 
created, is α-recursive and so bounded on α-finite sets by admissibility.) 
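
The ℵ_ω example can be made explicit by counting stages. The following is a sketch under assumptions that are ours: S_e denotes the set of stages at which e-requirements are created, at most one e-requirement for A is created at any one stage, and one is created for each argument x < ℵ_e.

```latex
% Sketch: S_e is a set of ordinals (stages) of cardinality at least \aleph_e,
% so it cannot be bounded below \aleph_e.
\[
  |S_e| \ge \aleph_e \ \Longrightarrow\ \sup S_e \ge \aleph_e \quad (e < \omega),
  \qquad\text{hence}\qquad
  \sup \bigcup_{e < \omega} S_e \ \ge\ \sup_{e < \omega} \aleph_e
    = \aleph_\omega = \alpha .
\]
```

So no single stage below α can come after the creation of all these requirements.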

The only other apparent problem is that even if N is α-finite C ∩ N 
need not be and so there could be no stage σ₁ as in the original argument. 
(Again note that if C ∩ N is α-finite it is enumerated in α-finitely many 
steps by admissibility.) Although this too is a common problem in 
α-recursion theory, SACKS [1966c] has supplied an easy remedy. We call a 
set C regular if C ∩ N is α-finite for every α-finite N. What SACKS [1966c] 
tells us is that every α-r.e. degree contains a regular α-r.e. set. Thus with 
no loss of generality we may assume that C is regular and so eliminate the 
second problem. 

The key to solving the first problem comes from SHORE [1975a, 1976a]. 
The idea is to make the list of negative requirements so short that the 
problem can never arise. For example if α = ℵ_ω, we would like to limit 
ourselves to an ω-list. In general the bound on the creation of e- 
requirements (if it exists) is given by a Σ₂ function. Thus we want to arrange 
the priority ordering e of reduction procedures in a list of length γ such 
that there is no Σ₂ map on any δ < γ which is unbounded in α. The greatest 
such γ (and so the most likely choice) is called the Σ₂ cofinality of α 
(σ2cf(α)). To accomplish this we block the reduction procedures into 
γ = σ2cf(α) many pieces {B_e}_{e<γ}, each a proper initial segment of the usual 
list of reduction procedures (thus for α = ℵ_ω, we would have B_e = ℵ_e for 
e < ω). The idea is, for each x, to accept computations of {δ}_σ^{A^σ}(x) = C^σ(x) 
from any δ ∈ B_e. We should also note that we can generate these blocks via 
a recursive approximation which will be correct on each initial segment of γ 
from some point on. (We use the usual approximation from ordinary 
recursion theory for total Σ₂ (and so Δ₂) functions. The admissibility of α 
makes the approximation converge pointwise and the convergence on 
initial segments is assured by γ’s being the Σ₂ cofinality of α.) 
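
The definition of the Σ₂ cofinality can be restated as a minimum. This is a sketch in our own notation, and reading "Σ₂ map" as "Σ₂ over L_α" is an assumption on our part. The greatest γ such that no δ < γ carries an unbounded Σ₂ map is the same as the least γ that does carry one.

```latex
% Sketch: h ranges over total maps from gamma to alpha that are Sigma_2 over L_\alpha.
\[
  \sigma 2\mathrm{cf}(\alpha)
  = \min \{\, \gamma \le \alpha :
      \text{there is a } \Sigma_2 \text{ map } h \colon \gamma \to \alpha
      \text{ whose range is unbounded in } \alpha \,\} .
\]
```

In the example of the text, α = ℵ_ω, this corresponds to γ = ω with the blocks B_e = ℵ_e.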

All that really remains to be checked is that this blocking does not 
interfere with computing C if there are α-infinitely many e-requirements 
for some e < γ. This, however, presents one more important problem that 
we must deal with before we can precisely formulate our construction. 
Consider some B_e and assume that as in the ordinary proof we have a stage 
σ₁ such that no e-requirement for A is ever destroyed after σ₁. In order to 
compute C, given that there are α-infinitely many e-requirements, we must 
eliminate those δ ∈ B_e which are inactive at some stage after σ₁ (and so 
forever). These will give incorrect answers, i.e. {δ}^A(x) ≠ C(x) for some x, 
while all the others give only correct values of C. Thus if we could bound, 
say by σ₂, the stages at which δ’s in B_e become inactive, we could compute 
C as before by finding for each x the first computation of {δ}_σ^{A^σ}(x) = C^σ(x) 
for any σ > σ₂. 

Once again if the set W of δ ∈ B_e that become inactive were α-finite 
the desired bound would exist by admissibility. The problem however is 
that W is only α-r.e. and there can be α-r.e. subsets of α-finite sets (B_e 
here) which are not α-finite. (This of course was the cause of our first 
problem: C ∩ N might have been α-r.e. but not α-finite.) The solution 
here too is to make the listing of reduction procedures (as distinct from the 
priority ordering of blocks of procedures) so short that no bounded α-r.e. 
subset can fail to be α-finite. Thus we need a short listing of the ordinals 
less than α but we must also be able to generate it in a recursive way. The 
key notion here is that of the projectum, α*, of α: α* is the least β such 
that there is an α-recursive f mapping α one-one into β. The reason that this 
idea embodies the solution to our problem is the following: 


THEOREM. Any α-r.e. subset of any δ < α* is α-finite. 


(The proof is easy from a recursion theoretic viewpoint: the usual recursive 
enumeration without repetition of an α-infinite α-r.e. subset of δ would 
give a one-one α-recursive map from α into δ, contradicting the minimality 
of α*.) Thus we need only use an α* list of our reduction procedures to make 
W α-finite and solve our last problem. 
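
The enumeration step in this proof can be illustrated in a finite setting. The following Python sketch is only an analogy: natural numbers stand in for ordinals, and the list `stages` is a hypothetical stand-in for an α-recursive enumeration with repetitions. It shows that listing each element the first time it appears yields a one-one map from positions to elements, which is the map that, for an α-infinite set, would inject α into δ.

```python
def enumerate_without_repetition(enumeration):
    """Yield each element the first time it appears in the enumeration."""
    seen = set()
    for element in enumeration:
        if element not in seen:
            seen.add(element)
            yield element

# Hypothetical enumeration (with repeats) of a subset W of delta = 10.
stages = [4, 1, 4, 7, 1, 2, 7, 9]

# The repetition-free listing: position n is sent to the n-th new element.
listing = list(enumerate_without_repetition(stages))
index_map = dict(enumerate(listing))

# The map n -> listing[n] is one-one, because no element is listed twice.
is_one_one = len(set(index_map.values())) == len(index_map)
```

In the finite case the listing simply stops; the point of the theorem is that for δ < α* it must also stop (at an α-finite stage), since otherwise the listing would project α into δ.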

All that remains is to combine our solutions: we must divide up α* 
into γ many blocks. Let f : α → α* be a one-one α-recursive projection 
and let h : γ → α be a Σ₂ map unbounded in α. Our blocks are then given 
by B_e = fh(e) for e < γ. Note that by admissibility the range of f⁻¹ is 
bounded on any proper initial segment of α*. Thus h″γ projects to an 
unbounded sequence in α* and so we include every reduction procedure in 
these blocks. Moreover, by our previous remarks we can α-recursively 
approximate these blocks, say by B_e(σ), so that for each e, B_e(σ) is equal 
to B_e from some point on. We can also guess at f⁻¹(δ) for any δ < α* in an 
α-recursive way by the usual procedure (e.g. compute f(τ) for τ < σ and 
see if you get δ). Combining the various approximations we can now 
precisely describe our construction with notational conventions as before. 
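
The "usual procedure" for guessing at f⁻¹(δ) can be sketched as a bounded search. This is a minimal Python illustration under stated assumptions: natural numbers replace ordinals, and `f` below is a hypothetical one-one map chosen only for the example, not the projection of the text.

```python
def guess_inverse(f, delta, stage):
    """Stage-bounded guess at the inverse of f at delta.

    Searches tau < stage for f(tau) == delta; returns None if no
    preimage has appeared by this stage.
    """
    for tau in range(stage):
        if f(tau) == delta:
            return tau
    return None

def f(tau):
    # Hypothetical one-one map on the naturals: swaps 0<->1, 2<->3, ...
    return tau + 1 if tau % 2 == 0 else tau - 1

early_guess = guess_inverse(f, 3, 2)   # only tau = 0, 1 examined so far
later_guess = guess_inverse(f, 3, 3)   # tau = 2 is now examined
final_guess = guess_inverse(f, 3, 50)  # the guess never changes again
```

Because f is one-one, once a preimage is found the guess is final, which is the sense in which the approximation to f⁻¹(δ) reaches its final value at some stage.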
At stage σ we find for each e < γ the least computation of 
{f⁻¹δ}^{A^σ}(x) = C^σ(x) for some e-active δ ∈ B_e(σ) (x is the least y which 
is not the argument of some negative e-requirement). We then create a 
negative e-requirement and proceed with the rest of the construction exactly 
as above. 

The verification that the construction succeeds proceeds as in ordinary 
recursion theory with the new difficulties handled as we have described. By 
induction there are only α-finitely many e'-requirements for e' < e; thus 
there is a bound on their creation given by a Σ₂ function. As e < γ they 
form a single α-finite set N and we have a bound σ₀ on the stages at which 
they are created. By the regularity of C we have a stage σ₁ ≥ σ₀ by which 
all elements of N have been enumerated in C. We can also take σ₁ to be 
large enough so that B_e(σ) and f⁻¹(δ) for δ ∈ B_e have all reached their 
final values. Once again e-requirements now created are permanent. As 
B_e < α* the δ ∈ B_e that become inactive form an α-finite set and so 
contribute only boundedly much. Thus if there were α-infinitely many 
e-requirements for A we could again compute C α-recursively for our 
contradiction. Thus there are only α-finitely many e-requirements for A. 
The B part of the argument is the same and the induction continues. 

Finally to see that C ≰_α A (or C ≰_α B) we suppose that C ≤_α A with e' 
giving the reduction procedure as in the definition of ≤_α. Let 
f(e') = δ ∈ B_e for some e. We can now argue (beginning at stage σ₁) that 
there must be α-infinitely many e-requirements created. By choice of δ there 
are correct computations of {f⁻¹(δ)}^A(x) = C(x) for every x, so some 
requirement must be created for each x. As this contradicts our last result, 
we have that C ≰_α A (and similarly C ≰_α B) as required. 

This completes the proof of the Splitting Theorem. (For more details see 
SHORE [1975a].) We will now see how the ideas used to overcome the 
special difficulties in α-recursion theory interact with some other areas of 
research. 


4. Interactions and applications 


We begin with the notion of projectum which played a key role in our 
proof and which is closely connected with some of the important 
set-theoretic concerns about the fine structure of L. An obvious question 
that arises if one is studying the constructible hierarchy is just when sets 
are constructed. Thus for example all (constructible) subsets of ω are 
constructed by level ℵ₁ (of L) but one wants to know exactly at which levels 
they occur. This particular question was answered by BOOLOS and PUTNAM 
[1968] (who incidentally were working on degree hierarchies in ordinary 
recursion theory). The results are generalized in KRIPKE [1964a] and 
PLATEK [1966], while the best (and most general) results are in JENSEN 
[1972]: 


THEOREM. There is a subset of δ in L_{α+1} − L_α given by a definition which 
is Σ_n (Δ_n) over L_α iff there is a function f : α → δ which is one-one and 
Σ_n over L_α (and onto δ). 


For the example of δ = ω this says that a new subset of ω is constructed at 
level α just in case everything becomes countable at that level. 

If we consider admissible α with n = 1 and apply the appropriate 
translation, we see that this is precisely the key fact about the projectum 
α* of α: If one can’t map α one-one into δ by an α-recursive function 
(= total Σ₁ = Δ₁) then there is no new α-r.e. (= Σ₁) subset of δ (so they are 
all α-finite). The general result of Jensen plays a key role in both 
set-theoretic and later recursion-theoretic results. 

For work in α-recursion theory it is often necessary to use more 
complicated projections than just Σ₁ so as to get even shorter listings. As 
examples we cite the minimal pair constructions of LERMAN and SACKS 
[1972] and SHORE [1975b] (as well as LEGGETT and SHORE [1976]) in which 
the Σ₂ projectum is used to list the reduction procedures and requirements. 
In another vein a relativized version of the theorem for α-r.e. sets and 
n = 1 is used in SHORE [1976a] to show that the α-r.e. degrees are dense. Of 
course such higher-level projections can be introduced into α-recursive 
constructions only at the expense of complicating the approximation 
procedures and weakening the convergence properties needed in the 
proofs. 

The main set-theoretical applications of projecta followed from Jensen’s 
remarkable uniformization theorem: 


THEOREM (JENSEN [1972]). Every Σ_n relation over L_α can be uniformized by 
a function Σ_n over L_α. 


Although this result obviously has a recursion-theoretic flavor, it was a key 
ingredient of many of Jensen’s purely set-theoretical results. As a prime 
example we cite the results of JENSEN [1972] that Souslin’s hypothesis fails 
in L. (There are many other results related by their methods of both 
combinatorial and model-theoretic character.) We should also note that 
Jensen’s proofs of these results have an additional recursion-theoretic 
touch. They depend heavily on the idea of uniformity. It is by proceeding 
always in a prescribed uniform way that the construction is carried past 
limit stages. 

As a final application of these notions we cite the recent work of 
JOCKUSCH and SIMPSON [1976]. Here the idea of generalized projecta and 
the related one of coding L_α as a subset of the projectum came back to the 
source of its original motivation: degree hierarchies in ordinary recursion 
theory. They are used (together with Sacks’ perfect set forcing) to prove 
results on definability in the theory of Turing degrees with jump. For 
example, it is shown that the ramified analytic hierarchy is definable level 
by level in the theory of degrees with jump. 

The other major innovation needed for the proof in Section 3 (the 
blocking technique) has had some repercussions for axiomatic recursion 
theory. A key question for such an approach is what is needed to do 
priority arguments, as these are perhaps the deepest and most 
characteristic methods of ordinary recursion theory. In a typical approach to 
an axiomatic recursion theory (as in MOSCHOVAKIS [1971]), the main structural 
loss (relative to α-recursion theory) is the recursive wellordering of the 
universe. In its place one has only some nice sort of prewellordering. 
Although this does not seem to be quite enough to do priority arguments 
(see SIMPSON [1974b] who, however, needs the axiom of determinateness 
for his counterexample), the blocking technique does supply some answers. 
One can indeed block all elements of a single level of the prewellordering 
(assuming they are ‘‘finite’’), but one still seems to need a notion of 
projection to argue that each block settles down. Thus one can give an 
extended axiomatization which allows one to employ this technique to 
carry out many priority arguments. (This, too, is pointed out in SIMPSON 
[1974b].) 

Finally we would like to mention two applications of the overall results 
and methods of priority arguments in α-recursion theory. The first is to 
enable one to do certain forcing constructions over uncountable structures. 
This idea is used for example in SHORE [1974b] where a priority argument is 
done on a forcing construction. A special case of the results there shows 
that over every model of set theory with, say, Σ₁ global choice there are Σ₁ 
classes A, B such that neither is Δ₁ in the other. Another example of 
mixing a forcing construction with α-recursion theory appears in SIMPSON 
[1974a]. There it is used to prove an analog of Friedberg’s theorem on the 
jump operator for α-degrees. 

The last application of these methods we consider is to recursion in 
higher type objects. HARRINGTON [1973] has given a method for directly 
translating certain results of α-recursion theory into ones about higher 
type objects. The only added proviso is that the constructions in 
α-recursion theory must be uniform in the sense that no infinite parameters 
are used to describe them. Although we did not bother to do this here (we 
used both γ and α*), it can be done. (See SHORE [1975a] and MAASS [1977c] for 
the Splitting Theorem and SHORE [1974b] for the Friedberg-Muchnick 
solution to Post’s problem.) Thus we “automatically” have analogs for the 
priority argument constructions from α-recursion theory in the theory of 
higher type objects. 


5. Bibliographic guide 


We have tried to give a complete listing of papers devoted to α-recursion 
theory with very brief summaries of their contents. As a rule we list 
abstracts and summarize theses only when we do not know of the 
material’s having appeared elsewhere. A fair number of papers dealing 
with α-recursion theory only peripherally or with its connections with 
other subjects have also been listed but not summarized. 

For the reader interested in pursuing α-recursion theory we recommend 
SACKS [1978] (when available) as a starting point. Until then, SIMPSON 
[1974a] and SHORE [1975a] are reasonable first papers to read. For more 
information on and other approaches to priority arguments and the α-r.e. 
degrees we suggest SACKS and SIMPSON [1972], LERMAN [1972] and (for the 
committed) SHORE [1976a]. For the lattice of α-r.e. sets one might begin 
with LERMAN and SIMPSON [1973], followed by LERMAN [1974a] on maximal 
sets and LERMAN [1975b, 1976] on general lattice-theoretic problems. 

There are also a number of important topics that we have not mentioned 
at all here. The jump operator is considered in SIMPSON [1974a] and in 
SHORE [1976b]. The phenomena of non-regularity and non-hyperregularity 
are investigated in SIMPSON [1971] and the structure of such degrees is 
analyzed in SHORE [1975b]. 

Turning to a wider ranging view we mention some starting points for 
work in other areas related to α-recursion theory. For model theory and a 
general view of admissibility see Chapter A.7 and BARWISE [1975]. 
Historically more relevant papers include BARWISE [1969] and KARP [1964]. For 
inductive definitions the basic reference is MOSCHOVAKIS [1974] with the 
relation to α-recursion theory coming in Chapter 9 which gives the results 
of BARWISE, GANDY and MOSCHOVAKIS [1971]. For the more intimate 
connections with admissible ordinals see CENZER [1974]. Axiomatic 
recursion theory is represented by MOSCHOVAKIS [1971] and FRIEDMAN [1971]. 
SIMPSON [1974] is also relevant here. Applications of α-recursion theory to 
higher type objects are in HARRINGTON [1973] and in LOWENTHAL [1974]. 
The fine structure of L is best set forth in Chapter B.5 or in JENSEN [1972]. 
Earlier papers in this area connected with recursion theory include JENSEN 
and KARP [1971] and GANDY [1974]. 
tame Σ₂ projection to give new version of finite injury arguments. 


[1973] Admissible ordinals and priority arguments, in: MATHIAS and ROGERS [1973] pp. 
311-344. 


Gives heuristic description of construction of a minimal pair of r.e. degrees in ordinary and 
α-recursion theory. Introduces “pinball” metaphor. 


[1974a] Maximal α-r.e. sets, Trans. Am. Math. Soc., 188, 341-386. 


Considers various possible definitions of maximal α-r.e. set. Proves they exist (for 
reasonable definitions) if s3p(α) = ω. Introduces this new (s3) projection given by recursive 
approximation with two extra variables. 


[1974b] Least upper bounds for pairs of α-r.e. α-degrees, J. Symbolic Logic, 39, 49-56. 
No minimal pair of α-r.e. degrees can have join 0'. 
[1976a] Types of simple α-recursively enumerable sets, J. Symbolic Logic, 41, 419-426. 


There are distinct 1-types in ℰ(α) for simple sets for many α. Uses major subset (every 
α-r.e. set has one) and hyperhypersimplicity. See also LEGGETT and SHORE [1976]. 


[1976b] Ideals of generalized finite sets in the lattice of α-recursively enumerable sets, to 
appear. 


∃! ideal of generalized finite sets definable over ℰ(α). For these sets ℰ(α) and ℰ*(α) are 
equidecidable. 
[1976c] Congruence relations, filters, ideals and definability in lattices of α-recursively 
enumerable sets, J. Symbolic Logic, 41, 405-418. 


∃ largest ideal, filter and equivalence relation definable in ℰ(α). Classifies all ideals for 
some α and leaves only two possibilities for others. In any case there are only finitely many. 
Points out that main result of MACHTEY [1971] works only for sentences not ∀∃. 


LERMAN, M. and G.E. SACKS 
[1972] Some minimal pairs of α-recursively enumerable degrees, Ann. Math. Logic, 4, 
415-442. 


If greatest cardinal of L_α = σ2p(α) < tσ2p(α) = α, then there is a minimal pair of α-r.e. 
degrees. See also SHORE [1976a]. 


LERMAN, M. and S.G. SIMPSON 
[1973] Maximal sets in α-recursion theory, Israel J. Math., 4, 236-247. 


Some necessary and some sufficient conditions for the existence of maximal α-r.e. sets, 
especially the necessity of countability. See also LERMAN [1974a]. Scattered results on 
r-maximal sets as well. 


LEVY, A. 
[1963] Transfinite computability, Notices Am. Math. Soc., 10, 286. 


Turing machine definition of recursion on regular cardinals equivalent to TAKEUTI [1960] 
approach and one similar to MACHOVER [1962]. 


LOWENTHAL, F.D. 
[1973] Some results on measure and category in α-recursion theory, Notices Am. Math. 
Soc., 20, A-450. 


For countable α: sets above (in ≤_α or ≤_{wα}) a given non-α-recursive one have measure 0 
and are of first category. Set of all joins of two degrees has measure 1. 


[1974] The minimal pair problem for higher type objects, Thesis, M.I.T., Cambridge, MA. 


Considers applying methods of HARRINGTON [1973] to minimal pair argument of LERMAN 
and SACKS [1972] to give results on higher type objects. 


MACHOVER, M. 
[1961] The theory of transfinite recursion, Bull. Am. Math. Soc., 67, 575-578. 


Introduces an equation calculus for recursion theory on regular cardinals using a sup 
operator and functions with infinitely many variables. Under assumption that V = L 
recursively Gödel numbers process and notes analogs of basic recursion theoretic facts. Gives a 
typical application to infinitary languages. For successor cardinals a the set of satisfiable 
formulas of L,,.. is not a-arithmetical. 


[1962] The theory of transfinite recursion (in Hebrew), Thesis, Hebrew University of 
Jerusalem, Jerusalem. 
MACHTEY, M. 
[1969] Intrinsic consistency results and lattices of recursively enumerable sets in abstract 
recursion theory, Thesis, M.I.T., Cambridge, MA. 
[1970] Admissible ordinals and intrinsic consistency, J. Symbolic Logic, 35, 389-400. 


One cannot in general find an intrinsically consistent (i.e. applicable to all sets) reduction 
procedure that can replace a given one if α is not a cardinal. One can if α is a cardinal and 
V = L. 


[1971] Admissible ordinals and lattices of α-r.e. sets, Ann. Math. Logic, 2, 379-417. 
Claims Lachlan’s decision procedure for ∀∃ formulas of ℰ* works for α when α* = ω, but 
see LERMAN [1976c] for a counterexample at the second quantifier level. 


[1974] Minimal degrees in generalized recursion theory, Z. Math. Logik Grundlagen 
Math., 20, 133-148. 


Modifies MACINTYRE [1973] to show that ∃ 2^{ℵ₀} minimal α-degrees for countable α. Main 
result is that for α* = ω there is a minimal α-degree below each α-r.e. degree. 


MACINTYRE, J.M. 
[1968] Contributions to metarecursion theory, Thesis, M.I.T., Cambridge, MA. 


In addition to next two items studies subsets of ω generic over L_{ω₁^{CK}} and structure of 
α-degrees below such a set. 


[1973] Minimal α-recursion theoretic degrees, J. Symbolic Logic, 38, 18-28. 
There exist minimal α-degrees for countable α and regular cardinals of L. 
[1974] Non-initial segments of the α-degrees, J. Symbolic Logic, 38, 368-388. 


∀α < ℵ₁ ∃A ⊆ α (every countable distributive lattice with 0 and 1 is isomorphic to an initial 
segment of α-degrees above A). Thus these theories of α-degrees are undecidable. 


MATHIAS, A.R.D. and H. ROGERS, editors 
[1973] Cambridge Summer School in Mathematical Logic, held in Cambridge/England, 
August 1-21, 1971 (Springer, Berlin). 
METAKIDES, G. 
[1972] α-degrees of α-theories, J. Symbolic Logic, 37, 667-682. 


For every ω₁^{CK}-r.e. X ⊆ ω ∃ ω₁^{CK}-theory, T(X), with bounded ω₁^{CK}-r.e. axioms of same degree 
as X. ∀X ⊆ β < ω₁^{CK} ∃T(X) of same degree. Same proof works for all α. 


MOSCHOVAKIS, Y.N. 
[1971] Axioms for computation theories — first draft, in: GANDY and YATES [1971] pp. 
199-255. 
[1974] Elementary Induction on Abstract Structures (North-Holland, Amsterdam). 
MYERS, D.L. 
[1970] Meta-arithmetical hierarchies, Thesis, M.I.T., Cambridge, MA. 


Considers relativization in recursion on ω₁^{CK} and examines possible notions of jump and 
arithmetical hierarchy with partial results. 


OHASHI, K. 
[1970] On a question of G.E. Sacks, J. Symbolic Logic, 35, 46-50. 


Intrinsically consistent reduction procedures are not sufficient. See also MACHTEY [1970]. 


OWINGS, J.C., JR. 
[1966] Topics in metarecursion theory, Thesis, Cornell University, Ithaca, NY. 
[1967] Recursion, metarecursion and inclusion, J. Symbolic Logic, 32, 173-179. 


Set A is of type 1 if for every B maximal in A there is a maximal C such that A ∩ C = B. 
Otherwise A is of type 2. Maximal ω₁^{CK}-r.e. sets A with Ā unbounded are of type 1, with Ā 
bounded of type 2. Both exist. 


[1969] Π¹₁-sets, ω-sets and metacompleteness, J. Symbolic Logic, 34, 194-204. 


Every ω₁^{CK}-degree of a non-regular ω₁^{CK}-r.e. set or equivalently of a Π¹₁ − Σ¹₁ subset of ω 
contains such a set with complement of order type ω. Every non-regular ω₁^{CK}-r.e. set has some 
complete r.e. set weakly recursive in it so ≤_{wω₁^{CK}} is intransitive on Π¹₁ sets. 
[1970] The metarecursively enumerable sets but not the Π¹₁ sets can be enumerated 
without repetitions, J. Symbolic Logic, 35, 223-229. 


Proves the theorem of title. 
[1971] A splitting theorem for simple Π¹₁ sets, J. Symbolic Logic, 36, 433-438. 


Any simple Π¹₁ set (= simple ω₁^{CK}-r.e. non-ω₁^{CK}-recursive subset of ω) can be split into 
disjoint Π¹₁ sets of strictly smaller ω₁^{CK}-degrees. 


PLATEK, R. 
[1966] Foundations of recursion theory, Thesis, Stanford University, Stanford, CA. 


Develops α-recursion theory as a special case of generalized recursion theory. Approach is 
via definability by (primitive) recursion. Then adds on a search operator. Isolates correct 
primary notions (admissibility, α-finite, projectum etc.) and establishes basic recursion 
theoretic facts and the equivalences with the set theoretic approach. 


ROGERS, H., JR. 
[1967] Theory of Recursive Functions and Effective Computability (McGraw-Hill, New 
York). 
SACKS, G.E. 
[1966a] Degrees of Unsolvability, Annals of Mathematics Studies, Vol. 55 (Princeton 
University Press, Princeton, NJ, 2nd ed.). 
[1966b] Metarecursively enumerable sets and admissible ordinals, Bull. Am. Math. Soc., 
72, 59-64. 
Describes meta (= ω₁^{CK}) recursion in terms of Π¹₁ = r.e. and also via equation calculus. ∃ two 
Π¹₁ sets which are ω₁^{CK}-incomparable. Announces results of SACKS [1966c]. 
[1966c] Post’s problem, admissible ordinals and regularity, Trans. Am. Math. Soc., 124, 
1-23. 


Every α-r.e. degree contains a regular α-r.e. set. Uses a weak priority argument to show 
that ∃ (with respect to all reducibilities) an incomplete non-α-recursive α-r.e. set. Introduces 
notions of regularity and hyperregularity. ∃ countable α with no maximal α-r.e. set. 


[1967] Metarecursion theory, in: CROSSLEY [1967] pp. 243-263. 


A review and exposition of KREISEL and SACKS [1965] and the basic results of recursion on 
ω₁^{CK} but using equation calculus in place of notations from 𝒪 and Π¹₁. Also shows that every 
non-regular ω₁^{CK}-r.e. set is of same ω₁^{CK}-degree as a Π¹₁ subset of ω. Announces results of SACKS 
[1971]. 


[1971] On the reducibility of Π¹₁ sets, Advances in Mathematics, 7, 57-82. 


Uses ω₁^{CK}-infinite, coinfinite forcing conditions to build Π¹₁ sets which are ω₁^{CK}-subgeneric 
(= hyperregular) and incomparable with respect to arbitrary computation of less than ω₁^{CK} 
steps. Analogous results for α* = ω. 


[1978] Higher Recursion Theory (Springer, Berlin), to appear. 


Introduction to α-recursion theory (Chapters V, VI). Includes special case of α = ω₁^{CK} and 
some priority arguments for all α. Contains most of the basic facts and definitions. 


SACKS, G.E. and S.G. SIMPSON 
[1972] The α-finite injury method, Ann. Math. Logic, 4, 323-367. 


First real priority argument for every α to show that ∃ hyperregular α-incomparable α-r.e. 
sets. Argues inside a sequence of 2, substructures of L. for most difficult case. 
SCOTT, D., editor 
[1971] Axiomatic Set Theory, Proceedings of Symposia in Pure Mathematics, Vol. 13, Part 
1 (Am. Math. Soc., Providence, RI). (The UCLA 1967 Set Theory Conference.) 
SHOENFIELD, J.R. 
[1971] Degrees of Unsolvability (North-Holland, Amsterdam). 
SHORE, R.A. 
[1972a] Priority arguments in α-recursion theory, Thesis, M.I.T., Cambridge, MA. 
[1972b] Minimal α-degrees, Ann. Math. Logic, 4, 393-414. 


∃ minimal α (and cα) degrees if α is Σ₂ admissible using priority method of SACKS and 
SIMPSON [1972]. Also for some other ordinals. 


[1974a] Cohesive sets: countable and uncountable, Proc. Am. Math. Soc., 44, 442-445. 
Many uncountable α have cohesive subsets. Which ones do is independent of ZFC. 
[1974b] Σ_n sets which are Δ_n incomparable (uniformly), J. Symbolic Logic, 39, 295-304. 


Mixes forcing and priority arguments to show that for each n there are indices k, l < ω for 
Σ_n sets which are Δ_n incomparable for every Σ_n admissible α. 


[1975a] Splitting an α-recursively enumerable set, Trans. Am. Math. Soc., 204, 65-78. 


Given non-α-recursive α-r.e. D and a regular α-r.e. C, ∃A, B (A ∪ B = C, A ∩ B = ∅, 
A, B ≤_α C, D ≰_α A and D ≰_α B). Same theorem for ≤_{wα} and usual corollaries for degrees. 
Introduces blocking technique for priority arguments. 


[1975b] Some more minimal pairs of α-r.e. degrees, Notices Am. Math. Soc., 22, A524-525. 


Settles one of two cases left open in LERMAN and SACKS [1972]: if σ2cf(α) = α, then ∃ 
minimal pair of α-r.e. degrees. 


[1975c] The irregular and non-hyperregular α-r.e. degrees, Israel J. Math., 22, 28-41. 


Characterizes ordinals α for which ∃! α-r.e. degree of non-regular or non-hyperregular 
α-r.e. set. Proves a splitting theorem for such degrees otherwise. ≤_{wα} is intransitive on α-r.e. 
sets iff there is more than one non-hyperregular α-r.e. degree. 


[1976a] The recursively enumerable α-degrees are dense, Ann. Math. Logic, 9, 123-155. 


Extends ideas of SHORE [1975a] to prove density of α-r.e. α- and cα-degrees. First infinite 
injury priority argument in α-recursion theory. 


[1976b] On the jump of the recursively enumerable α-degrees, Trans. Am. Math. Soc., 217, 
351-363. 


Considers various possible definitions for jump operator. Argues for one and shows that 
there is no incomplete α-r.e. A with A' = 0'' if ∃! non-hyperregular α-r.e. degree. There is 
such an A if σ2cf(α) = α. The proof given, that if A is non-hyperregular, A' ≡_α 0'', really only 
shows that 0'' ≤_{wα} A'. 


SIMPSON, S.G. 
[1971] Admissible ordinals and recursion theory, Thesis, M.I.T., Cambridge, MA. 


In addition to material included in LERMAN and SIMPSON [1973] and SIMPSON [1974a], 
contains much information on non-hyperregular α-r.e. sets, simple subsets of α* and many 
interesting counterexamples. 


[1974a] Degree theory on admissible ordinals, in: FENSTAD and HINMAN [1974] pp. 
165-194. 
b ≥ 0' is regular iff it is the jump of a regular hyperregular degree. Gives necessary and 
sufficient conditions for the existence of a cone of regular degrees. ∃ incomparable α-r.e. a 
and b such that (a ∨ b)' = 0'. Gives a survey of other results on α-degrees and considers 
connections with Jensen’s fine structure of L. 


[1974b] Post’s problem for admissible sets, in: FENSTAD and HINMAN [1974] pp. 437-441. 


Using projective determinacy gives an admissible set for which there are no incomparable 
r.e. sets. Notes that one can use blocking technique to give positive answer if one has an analog of α* 
(calls such sets “thin”). 


STILLWELL, J. 
[1970] Reducibility in generalized recursion theory, Thesis, M.I.T., Cambridge, MA. 


Attempts to do priority arguments for all admissible α. The realization that life was not as 
simple as it here appeared led to the actual solution in SACKS and SIMPSON [1972]. 


SUKONICK, J. 
[1969] Lower bounds for pairs of metarecursively enumerable degrees, Thesis, M.I.T., 
Cambridge, MA. 


There exists a minimal pair of ω₁^{CK}-r.e. degrees. 


TAKAHASHI, M. 
[1968] Recursive functions of ordinal numbers and Levy’s hierarchy, Comment. Math. 
Univ. St. Paul, 17, 21-29. 


Recursive on ON equals Δ₁ if V = L. 


TAKEUTI, G. 
[1954] Construction of the set theory from the theory of ordinal numbers, J. Math. Soc. 
Japan, 6, 196-220. 
[1957] On the theory of ordinal numbers, J. Math. Soc. Japan, 9, 93-113. 
[1960] On the recursive functions of ordinal numbers, J. Math. Soc. Japan, 12, 119-128. 


Introduces recursion on the ordinals by generating the semi-recursive functions from basic 
ones via recursion and a minimum operator. Recursive ones are gotten when min is only 
applied when bounded or total. The main result is that cardinals (in particular singular 
cardinals) are closed under recursive functions (induction on complexity). Uses theory to show 
that axioms of TAKEUTI [1957] can be effectivized. Motivation for this is in TAKEUTI [1954]. 


[1965a] A formalization of the theory of ordinal numbers, J. Symbolic Logic, 30, 295-317. 


Results from a paper given at the Symposium of Foundations of Mathematics, Katada, 
Japan, in October 1962. Presents recursion on ordinals and gives an “effective” consistency 
proof for set theory by showing that there is a recursive predicate isomorphic to ∈ on L. 


[1965b] Recursive functions and arithmetic functions of ordinal numbers, in: BAR-HILLEL 
[1965] pp. 179-196. 


Considers some axioms of infinity based on a notion of arithmetic inaccessibility. 


TUGUÉ, T. 
[1964] On the partial recursive functions of ordinal numbers, J. Math. Soc. Japan, 16, 
1-31. 


Gives an equation calculus for recursion on regular cardinals using sup operator. Shows it 
equivalent to Takeuti’s systems and proves some basic facts like recursion and hierarchy 
theorems. 
Supplement to bibliography* 


CHONG, C.T. 
[1977] Σ_n-cofinalities in L_α, to appear. 


Some results on the possible values of σncf(α) and its relation to σnp(α). 


FRIEDMAN, S. 
[1976] Recursion on inadmissible ordinals, Thesis, M.I.T., Cambridge, MA. 


The first work on recursion theory on inadmissible ordinals. Basic facts and definitions. 
Simple sets exist. A solution to Post’s problem for many inadmissible β. 


JACOBS, B.E. 
[1975] α-computational complexity, Thesis, Courant Institute of Mathematical Sciences, 
New York University, New York. 
[1977a] On generalized computational complexity, J. Symbolic Logic, to appear. 


Generalizes the basic notions of abstract complexity theory to α-recursion theory. Analogs 
of compression and gap theorems. 


[1977b] The α-union theorem and generalized primitive recursion, to appear. 


Proves analog of union theorem and considers two analogs of primitive recursion on 
admissible α. 


[1977c] α-naming and α-speedup theorems, to appear. 


Proves analogs of these theorems. 


LERMAN, M. 
[1977] On elementary theories of some lattices of α-recursively enumerable sets, to 
appear. 


Decides the ∀∃ theory of ℰ*(α) for two different classes of ordinals. If σ2cf(α) = 
tσ2p(α) = ω and α* = α, the theory is that of ℰ*(ω). If s3cf(α) = σ3p(α) = α (e.g. a regular 
cardinal of L), then a different decidable ∀∃ theory is given. 


MAASS, W. 
[1977a] Inadmissibility, tame r.e. sets and the admissible collapse, to appear. 


Introduces the notion of admissible collapse and studies it and other items in inadmissible 
recursion theory. 


[1977b] On minimal pairs and minimal degrees in higher recursion theory, to appear. 


Constructs hyperregular minimal α-r.e. pairs for many α. Applies methods of MAASS 
[1977a] to construct minimal degrees when σ2cf(α) = σ2p(α) < α. 


[1977c] The uniform regular set theorem in α-recursion theory, to appear. 


Settles a question from SACKS [1966c] by giving a uniform construction of a regular α-r.e. 
set of the same α-degree as any given α-r.e. set. 


SHORE, R.A. 
[1976c] Combining the density and splitting theorems for α-r.e. degrees, Notices Am. 
Math. Soc., 23, A-598. 


If A' ≡_α 0' and A <_α D are α-r.e. then ∃B, C α-r.e. with A <_α B, C <_α D and 
B ∨ C ≡_α D. By SHORE [1976b] there are α (e.g. ℵ_ω^L) for which A' ≡_α 0' for every incomplete 
α-r.e. A. Thus by a result of Lachlan the elementary theories of the α-r.e. degrees with ≤_α 
are different for α = ω and α = ℵ_ω^L. 


* Added in proof. 
Introduction 


Recursion in higher types was introduced by KLEENE [1959, 1963] and 
was soon recognized as a deep and significant extension of the theory of 
recursive functions on the integers. Our purpose here is to give an 
exposition of the basic notions and facts of this theory in a manner which 
will make them accessible to the mathematician with a working knowledge 
of ordinary recursion theory. 

From its inception, higher-type recursion has been considered difficult 
and somewhat esoteric. Although it has been brought to a seasoned 
maturity with the contributions of many researchers in the last fifteen 
years, it has not been understood by as wide a circle of mathematicians as it 
deserves. This is partly due to the technical difficulty of the basic papers on 
the subject. More than that, the basic notions of the theory have been 
considered difficult to understand and foundationally problematical. 

Here we will develop higher-type recursion in the context of the general 
theory of inductive definability. KLEENE [1963] himself saw this possibility 
and discussed it at the end of Section 10. Later, PLATEK [1966] gave a very 
satisfactory foundation for the subject within a theory of induction. His 
work, however, was also technically difficult and not easily applicable to 
Kleene recursion. 

The present approach is due to Moschovakis. A discussion of how it fits
within a general theory of induction is given in MOSCHOVAKIS [1976], but
knowledge of that paper is not needed to read this one.

We should emphasize that this is an exposition of the elementary parts of 
the theory of higher-type recursion. In the last section we will make some 
suggestions on what to read next, for the reader who wants to become an 
expert. 


FUNCTIONAL INDUCTION 


1. Monotone operators on partial functions 


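Fix an infinite set $A$ such that $\omega = \{0, 1, 2, \ldots\} \subseteq A$. An $n$-ary partial
function (on $A$, into $\omega$) is any mapping from a subset of $A^n$ into $\omega$. We
collect these into a set,

$$\mathcal{PF}_n(A) = \mathcal{PF}_n = \{f : f \text{ is an } n\text{-ary partial function}\}.$$

An operator $\Theta : \mathcal{PF}_n \to \mathcal{PF}_n$ is monotone if for all $f, g \in \mathcal{PF}_n$,

$$f \subseteq g \Rightarrow \Theta(f) \subseteq \Theta(g).$$

Each such monotone operator determines a transfinite sequence $\{f_\Theta^\xi\}$ of
partial functions by the recursion

$$f_\Theta^\xi = \Theta(f_\Theta^{<\xi}),$$

where $f_\Theta^{<\xi} = \bigcup_{\eta<\xi} f_\Theta^\eta$, i.e.,

$$f_\Theta^{<\xi}(\bar{x}) = w \Leftrightarrow \text{for some } \eta < \xi,\ f_\Theta^\eta(\bar{x}) = w.$$

In particular, $f_\Theta^0 = \Theta(\emptyset)$, where $\emptyset$ is the empty $n$-ary partial function.
A simple induction shows that

$$\eta \le \xi \Rightarrow f_\Theta^\eta \subseteq f_\Theta^\xi,$$

so by a cardinality argument there is a least ordinal $\kappa$ such that

$$f_\Theta^\kappa = f_\Theta^{<\kappa} = \bigcup_\xi f_\Theta^\xi.$$

We call $f_\Theta^\kappa$ the partial function inductively defined by $\Theta$ and we denote it by

$$f_\Theta^\infty = \bigcup_\xi f_\Theta^\xi.$$

It is easy to check that $f_\Theta^\infty$ is also the least fixed point of $\Theta$, i.e.,

$$\Theta(f_\Theta^\infty) = f_\Theta^\infty \quad\text{and}\quad \Theta(f) \subseteq f \Rightarrow f_\Theta^\infty \subseteq f.$$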
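The iteration above can be tried out on a toy case. The following is only a sketch under simplifying assumptions that are not part of the text: the carrier is a finite initial segment of $\omega$, partial functions are represented as Python dictionaries, and all stages are finite, so the transfinite recursion collapses to an ordinary loop. The names `lfp` and `theta_fact` are ours.

```python
def lfp(theta):
    """Iterate a monotone operator from the empty partial function until it stabilises."""
    f = {}                      # the empty partial function
    while True:
        g = theta(f)            # next stage of the iteration
        if g == f:              # fixed point reached
            return f
        f = g


def theta_fact(f, bound=6):
    """A monotone operator whose least fixed point is factorial on 0..bound-1."""
    g = {0: 1}
    for n in range(1, bound):
        if n - 1 in f:          # g(n) becomes defined only once f(n-1) is defined
            g[n] = n * f[n - 1]
    return g


factorial_table = lfp(theta_fact)
```

Each pass through the loop adds exactly one new argument to the domain, which is the finite shadow of the increasing sequence of stages described above.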


2. Suitable classes of functionals 


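2.1. A functional (on $A$ with values in $\omega$) is a partial mapping

$$\Phi : A^n \times \mathcal{PF}_{k_1} \times \cdots \times \mathcal{PF}_{k_m} \to \omega$$

from a Cartesian product of copies of $A$ and some of the spaces $\mathcal{PF}_k$ into
$\omega$, which is monotone, i.e.

$$f_1 \subseteq g_1, \ldots, f_m \subseteq g_m,\quad \Phi(\bar{x}, f_1, \ldots, f_m) = w \Rightarrow \Phi(\bar{x}, g_1, \ldots, g_m) = w.$$

We allow here that $n$ or $m$ may be 0; in particular all partial functions
are functionals.
The signature of the functional

$$\Phi(\bar{x}, \bar{f}) = \Phi(x_1 \cdots x_n, f_1 \cdots f_m)$$

is $(n, k_1 \cdots k_m)$ if $f_i$ is $k_i$-ary. There is a natural 1-1 correspondence
between functionals of signature $(n, k_1 \cdots k_m)$ and monotone operators
$\Theta : \mathcal{PF}_{k_1} \times \cdots \times \mathcal{PF}_{k_m} \to \mathcal{PF}_n$, which associates with $\Phi(\bar{x}, f_1 \cdots f_m)$ the
mapping

$$\Theta(f_1 \cdots f_m) = \lambda\bar{x}\,\Phi(\bar{x}, f_1 \cdots f_m)$$

and to each $\Theta(f_1 \cdots f_m)$ the functional

$$\Phi(\bar{x}, f_1, \ldots, f_m) = \Theta(f_1 \cdots f_m)(\bar{x}).$$

In particular, if $\Phi$ has signature $(n, n)$, then the associated $\Theta$ is a monotone
operator on $\mathcal{PF}_n$. It will be convenient to abbreviate the $\xi$-th iterate of $\Theta$
by $\Phi^\xi$, so that

$$\Phi^\xi(\bar{x}) = \Phi(\bar{x}, \Phi^{<\xi})$$

with $\Phi^{<\xi} = \bigcup_{\eta<\xi} \Phi^\eta$. We also let $\Phi^\infty = \bigcup_\xi \Phi^\xi$ be the least fixed point of
(the operator associated with) $\Phi$ or the partial function inductively defined
by $\Phi$.

This definition relativizes directly to any finite sequence of partial
functions: given $\Phi(\bar{x}, f, \bar{g})$ of signature $(n, n, k_1, \ldots, k_m)$, put

$$\Phi^\xi(\bar{x}, \bar{g}) = \Phi(\bar{x}, \lambda\bar{x}'\,\Phi^{<\xi}(\bar{x}', \bar{g}), \bar{g}),$$
$$\Phi^\infty(\bar{x}, \bar{g}) = w \Leftrightarrow \exists\xi\ \Phi^\xi(\bar{x}, \bar{g}) = w,$$

where naturally

$$\Phi^{<\xi}(\bar{x}, \bar{g}) = w \Leftrightarrow \exists\eta < \xi\ \Phi^\eta(\bar{x}, \bar{g}) = w.$$

We call such functionals $\Phi(\bar{x}, f, \bar{g})$ of signature $(n, n, k_1, \ldots, k_m)$ operative
and we call $\Phi^\infty(\bar{x}, \bar{g})$ the functional inductively defined by $\Phi$. It has the
following minimality property: if $\Psi(\bar{x}, \bar{g})$ satisfies

$$\lambda\bar{x}\,\Phi(\bar{x}, \lambda\bar{x}'\,\Psi(\bar{x}', \bar{g}), \bar{g}) \subseteq \lambda\bar{x}\,\Psi(\bar{x}, \bar{g})$$

for every $\bar{g}$, then $\Phi^\infty \subseteq \Psi$.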


2.2. Let $\mathcal{F}$ be a class of functionals on $A$. The $\mathcal{F}$-fixed points are all the
functionals $\Phi^\infty(\bar{x}, \bar{g})$ which are inductively defined by operative functionals
$\Phi(\bar{x}, f, \bar{g})$ in $\mathcal{F}$. We say that $\Psi(\bar{x}, \bar{g})$ is $\mathcal{F}$-recursive if there is an $\mathcal{F}$-fixed
point $\Phi^\infty(\bar{u}, \bar{x}, \bar{g})$ and a finite sequence of constants $\bar{n} = (n_1 \cdots n_k)$ from $\omega$
such that $\Psi(\bar{x}, \bar{g}) = \Phi^\infty(\bar{n}, \bar{x}, \bar{g})$. In particular, a partial function $\varphi(\bar{x})$ is
$\mathcal{F}$-recursive if there is an operative functional $\Phi(\bar{u}, \bar{x}, f)$ in $\mathcal{F}$ and constants
$\bar{n}$ from $\omega$ such that

$$\varphi(\bar{x}) = \Phi^\infty(\bar{n}, \bar{x}).$$

Our main aim here is to study these $\mathcal{F}$-recursive partial functions, for
various $\mathcal{F}$'s. In order to obtain an interesting theory of $\mathcal{F}$-recursion, we
must impose some minimal closure conditions on the class $\mathcal{F}$.


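2.3. DEFINITION. A class of functionals $\mathcal{F}$ is suitable if $\mathcal{F}$ contains the initial
partial functions and functionals (i)-(v) and is closed under the rules
(vi)-(x) below:

(i) Characteristic function of $\omega$:

$$\varphi(a) = \chi_\omega(a) = \begin{cases} 0 & \text{if } a \in \omega, \\ 1 & \text{if } a \notin \omega. \end{cases}$$

(ii) Identity on $\omega$:

$$\varphi(a) = \begin{cases} a & \text{if } a \in \omega, \\ 0 & \text{if } a \notin \omega. \end{cases}$$

(iii) Successor function:

$$\varphi(a) = \begin{cases} a + 1 & \text{if } a \in \omega, \\ 0 & \text{if } a \notin \omega. \end{cases}$$

(iv) Characteristic function of equality on $\omega$:

$$\varphi(a, b) = \begin{cases} 0 & \text{if } a = b \in \omega, \\ 1 & \text{if } a \ne b \text{ and } a, b \in \omega, \\ 0 & \text{otherwise.} \end{cases}$$

(v) Evaluation:

$$\Phi(\bar{x}, f) = f(\bar{x}).$$

It is understood here that $\Phi$ is of signature $(n, n)$, so that $f(\bar{x})$ makes sense.
We will omit such trivial side conditions in the sequel.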
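(vi) Addition of variables: if $\Phi(\bar{x}, \bar{f})$ is in $\mathcal{F}$, so is

$$\Psi(\bar{z}, \bar{x}, \bar{y}, \bar{h}, \bar{f}, \bar{g}) = \Phi(\bar{x}, \bar{f}).$$

(vii) Composition: if $\Phi(a, \bar{x}, \bar{f})$ and $X(\bar{x}, \bar{f})$ are in $\mathcal{F}$, so is

$$\Psi(\bar{x}, \bar{f}) = \Phi(X(\bar{x}, \bar{f}), \bar{x}, \bar{f}).$$

(viii) Definition by cases: if $\Phi(\bar{x}, \bar{f})$ and $X(\bar{x}, \bar{f})$ are in $\mathcal{F}$, so is

$$\Psi(a, \bar{x}, \bar{f}) = \begin{cases} \Phi(\bar{x}, \bar{f}) & \text{if } a = 0, \\ X(\bar{x}, \bar{f}) & \text{if } a \ne 0 \text{ and } a \in \omega, \\ 0 & \text{if } a \notin \omega. \end{cases}$$

(ix) Substitution of projection: a projection is any mapping of the form

$$\pi(x_1 \cdots x_n) = x_i,$$

where $1 \le i \le n$. Thus (ix) asserts that if $\Phi(y_1 \cdots y_m, \bar{f})$ is in $\mathcal{F}$ and $\pi_1 \cdots \pi_m$
are projections, then

$$\Psi(\bar{x}, \bar{f}) = \Phi(\pi_1(\bar{x}) \cdots \pi_m(\bar{x}), \bar{f})$$

is also in $\mathcal{F}$.

(x) Functional substitutions: if $\Phi(\bar{x}, g_1 \cdots g_m)$ is in $\mathcal{F}$ and $X_1 \cdots X_m$ are
in $\mathcal{F}$ so is

$$\Psi(\bar{x}, \bar{f}) = \Phi(\bar{x}, \lambda\bar{y}_1 X_1(\bar{y}_1, \bar{x}, \bar{f}), \ldots, \lambda\bar{y}_m X_m(\bar{y}_m, \bar{x}, \bar{f})).$$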
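We will use trivial consequences of these closure properties without
explicit justification. For example, from (vii) and (viii) it follows that if
$\Phi_1$, $\Phi_2$, $X$ are in $\mathcal{F}$, then so is

$$\Psi(\bar{x}, \bar{f}) = \begin{cases} \Phi_1(\bar{x}, \bar{f}) & \text{if } X(\bar{x}, \bar{f}) = 0, \\ \Phi_2(\bar{x}, \bar{f}) & \text{if } X(\bar{x}, \bar{f}) \ne 0, \end{cases}$$

where $X(\bar{x}, \bar{f}) \ne 0$ abbreviates the statement

$X(\bar{x}, \bar{f})$ is defined and has value $\ne 0$.

Also from (v), (vi), (ix) and (x) it follows that if $\Phi(x_1 \cdots x_n, f_1 \cdots f_m) \in \mathcal{F}$
and $\pi$, $\rho$ are permutations of $\{1, \ldots, n\}$, $\{1, \ldots, m\}$ respectively, then

$$\Psi(x_1 \cdots x_n, f_1 \cdots f_m) = \Phi(x_{\pi(1)} \cdots x_{\pi(n)}, f_{\rho(1)} \cdots f_{\rho(m)})$$

is also in $\mathcal{F}$. Finally since $f = \lambda\bar{x} f(\bar{x})$, (v), (ix) and (x) imply that if
$\Phi(\bar{x}, g_1, g_2, g_3)$ is in $\mathcal{F}$, then so is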


V(X f8)= PALS Ayaly, x y)). 
2.4. We will now give two examples of nontrivial functionals which are
$\mathcal{F}$-recursive for any suitable $\mathcal{F}$.

Minimalization. Let

$$\Phi_\mu(\bar{x}, f) = \mu i\,(f(i, \bar{x}) = 0)$$

= the least $i \in \omega$ (if one exists) such that $f(i, \bar{x}) = 0$
and for all $j < i$, $f(j, \bar{x})$ is defined and $f(j, \bar{x}) \ne 0$.

To see that $\Phi_\mu$ is $\mathcal{F}$-recursive, let

$$\Psi(a, \bar{x}, g, f) = \begin{cases} f(a, \bar{x}) & \text{if } f(a, \bar{x}) = 0, \\ g(a + 1, \bar{x}) + 1 & \text{if } f(a, \bar{x}) \ne 0. \end{cases}$$

Clearly $\Psi \in \mathcal{F}$ and it is easy to verify that $\Phi_\mu(\bar{x}, f) = \Psi^\infty(0, \bar{x}, f)$, so that $\Phi_\mu$
is $\mathcal{F}$-recursive.
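To see the induction at work, the fixed point $\Psi^\infty$ can be unfolded as ordinary recursion. This is only an illustrative sketch, assuming all arguments are natural numbers and $f$ is a total Python function; the names `psi`, `psi_inf` and `mu` are ours. When no zero of $f$ exists the search does not terminate (in Python it ends with a `RecursionError`), which mirrors the fact that $\Phi_\mu$ is then undefined.

```python
def psi(a, x, g, f):
    """One step of the operative functional Psi(a, x, g, f) from the text."""
    v = f(a, x)
    return v if v == 0 else g(a + 1, x) + 1


def psi_inf(a, x, f):
    """The fixed point of Psi in its g-argument, unfolded as recursion."""
    return psi(a, x, lambda b, y: psi_inf(b, y, f), f)


def mu(x, f):
    """Phi_mu(x, f) = Psi^infinity(0, x, f): the least i with f(i, x) = 0."""
    return psi_inf(0, x, f)


# Example: least i with i*i >= x, encoded as a zero of f.
def below_square(i, x):
    return 0 if i * i >= x else 1
```

For instance, `mu(10, below_square)` searches $i = 0, 1, 2, \ldots$ and stops at the first $i$ with $i^2 \ge 10$.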
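Primitive recursion. Consider the functional $\Phi_p(a, \bar{x}, f_1, f_2)$ defined by

$$\Phi_p(a, \bar{x}, f_1, f_2) = 0 \quad \text{if } a \notin \omega,$$
$$\Phi_p(0, \bar{x}, f_1, f_2) = f_1(\bar{x}),$$
$$\Phi_p(i + 1, \bar{x}, f_1, f_2) = f_2(\Phi_p(i, \bar{x}, f_1, f_2), i, \bar{x})$$

and assume that $\mathcal{F}$ is a suitable class which contains the predecessor function

$$a \dot{-} 1 = \begin{cases} a - 1 & \text{if } a \in \omega,\ a \ge 1, \\ 0 & \text{if } a \notin \omega \text{ or } a = 0. \end{cases}$$

Put

$$\Psi(a, \bar{x}, g, f_1, f_2) = \begin{cases} f_1(\bar{x}) & \text{if } a = 0, \\ f_2(g(a \dot{-} 1, \bar{x}), a \dot{-} 1, \bar{x}) & \text{if } a \in \omega,\ a \ne 0, \\ 0 & \text{if } a \notin \omega. \end{cases}$$

Then $\Psi \in \mathcal{F}$ and it is easy to check that $\Phi_p = \Psi^\infty$, so that $\Phi_p$ is $\mathcal{F}$-recursive.
We will see in the next section that we do not really need to assume that
$a \dot{-} 1$ is in $\mathcal{F}$.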
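The same unfolding works for primitive recursion. The sketch below is an illustration only, restricted to arguments in $\omega$ (so the clause for $a \notin \omega$ is omitted); the names `pred`, `psi_p` and `phi_p` are ours. It passes the predecessor of $a$ as the stage argument of $f_2$, so that the result agrees with the defining equation $\Phi_p(i + 1, \ldots) = f_2(\Phi_p(i, \ldots), i, \bar{x})$.

```python
def pred(a):
    """Predecessor on the naturals, with pred(0) = 0."""
    return a - 1 if a >= 1 else 0


def psi_p(a, x, g, f1, f2):
    """One step of the operative functional for primitive recursion."""
    if a == 0:
        return f1(x)
    return f2(g(pred(a), x), pred(a), x)


def phi_p(a, x, f1, f2):
    """Fixed point of psi_p in its g-argument, unfolded as recursion."""
    return psi_p(a, x, lambda b, y: phi_p(b, y, f1, f2), f1, f2)


# Multiplication a * x by primitive recursion on a.
def times(a, x):
    return phi_p(a, x, lambda x0: 0, lambda prev, i, x0: prev + x0)


# Factorial: the stage argument i is used, the parameter x is ignored.
def fact(a):
    return phi_p(a, 0, lambda x0: 1, lambda prev, i, x0: prev * (i + 1))
```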


2.5. Let $\mathcal{F}_0 = \mathcal{F}_0(A)$ be the smallest suitable class of functionals on $A$.
More generally, if $\bar{\Phi} = (\Phi_1 \cdots \Phi_k)$ is a finite sequence of functionals, let

$$\mathcal{F}_0[\bar{\Phi}]$$

be the smallest suitable class of functionals containing $\Phi_1 \cdots \Phi_k$. We will
usually call the $\mathcal{F}_0[\bar{\Phi}]$-recursive functionals simply recursive in $\bar{\Phi}$.
3. The basic constructions 

Our basic tool for constructing complicated inductions is the following. 
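3.1. SIMULTANEOUS INDUCTION LEMMA. Let $\mathcal{F}$ be a suitable class of functionals
on $A$, let $\Phi_0(\bar{x}, f_0, f_1, \bar{g})$, $\Phi_1(\bar{y}, f_0, f_1, \bar{g})$ be two functionals in $\mathcal{F}$ and define
inductively

$$\Psi_0^\xi(\bar{x}, \bar{g}) = \Phi_0(\bar{x}, \lambda\bar{x}'\,\Psi_0^{<\xi}(\bar{x}', \bar{g}), \lambda\bar{y}'\,\Psi_1^{<\xi}(\bar{y}', \bar{g}), \bar{g}),$$
$$\Psi_1^\xi(\bar{y}, \bar{g}) = \Phi_1(\bar{y}, \lambda\bar{x}'\,\Psi_0^{<\xi}(\bar{x}', \bar{g}), \lambda\bar{y}'\,\Psi_1^{<\xi}(\bar{y}', \bar{g}), \bar{g}).$$

Then $\Psi_0^\infty$, $\Psi_1^\infty$ are both $\mathcal{F}$-recursive.

PROOF. Let $\bar{0}$ be a sequence of 0's of the appropriate length so that the
expressions below make sense and similarly for $\bar{1}$. Put

$$X(a, \bar{x}, \bar{y}, f, \bar{g}) = \begin{cases} \Phi_0(\bar{x}, \lambda\bar{x}' f(0, \bar{x}', \bar{0}), \lambda\bar{y}' f(1, \bar{1}, \bar{y}'), \bar{g}) & \text{if } a = 0, \\ \Phi_1(\bar{y}, \lambda\bar{x}' f(0, \bar{x}', \bar{0}), \lambda\bar{y}' f(1, \bar{1}, \bar{y}'), \bar{g}) & \text{if } a \in \omega,\ a \ne 0, \\ 0 & \text{if } a \notin \omega. \end{cases}$$

Then $X \in \mathcal{F}$ and a trivial induction on $\xi$ shows that

$$\Psi_0^\xi(\bar{x}, \bar{g}) = X^\xi(0, \bar{x}, \bar{0}, \bar{g}), \qquad \Psi_1^\xi(\bar{y}, \bar{g}) = X^\xi(1, \bar{1}, \bar{y}, \bar{g}),$$

so that $\Psi_0^\infty$, $\Psi_1^\infty$ are $\mathcal{F}$-recursive. (Strictly speaking we have $\Psi_0^\infty(\bar{x}, \bar{g}) =
X^\infty(0, \bar{x}, \bar{0}, \bar{g})$, which is not exactly of the right form but it is immediate that

$$\Psi_0^\infty(\bar{x}, \bar{g}) = \tilde{X}^\infty(0, \bar{0}, \bar{x}, \bar{g}),$$

where $\tilde{X}(a, \bar{y}, \bar{x}, f, \bar{g}) = X(a, \bar{x}, \bar{y}, \lambda a'\bar{x}'\bar{y}' f(a', \bar{y}', \bar{x}'), \bar{g})$.) $\square$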
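The tagging trick in this proof can be illustrated on a familiar pair of mutually recursive definitions. The sketch below is ours and makes simplifying assumptions: there are no parameters $\bar{g}$, all arguments are natural numbers, and the two clauses define "even" and "odd" with the value 0 read as "yes" and 1 as "no". The single functional `X` uses its first argument as a tag selecting which clause to run, with the unused slot filled by a dummy constant, as in the lemma.

```python
def phi0(x, f0, f1):
    """Clause for Psi_0: x is even (0 = yes, 1 = no)."""
    return 0 if x == 0 else f1(x - 1)


def phi1(y, f0, f1):
    """Clause for Psi_1: y is odd (0 = yes, 1 = no)."""
    return 1 if y == 0 else f0(y - 1)


def X(a, x, y, f):
    """Single operative functional; the tag a picks the clause."""
    f0 = lambda x2: f(0, x2, 0)   # recover Psi_0 from f, dummy 0 in the y slot
    f1 = lambda y2: f(1, 1, y2)   # recover Psi_1 from f, dummy 1 in the x slot
    return phi0(x, f0, f1) if a == 0 else phi1(y, f0, f1)


def X_inf(a, x, y):
    """Fixed point of X, unfolded as recursion."""
    return X(a, x, y, X_inf)


def is_even(x):
    return X_inf(0, x, 0)


def is_odd(y):
    return X_inf(1, 1, y)
```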


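As an immediate corollary of the preceding result we have the important:


3.2. FUNCTIONAL SUBSTITUTION THEOREM. Let $\mathcal{F}$ be a suitable class of
functionals on $A$. Then the class of $\mathcal{F}$-recursive functionals is closed under
functional substitutions, i.e., if $\Phi(\bar{x}, g_1 \cdots g_m)$, $X_1 \cdots X_m$ are all $\mathcal{F}$-recursive,
so is

$$\Psi(\bar{x}, \bar{f}) = \Phi(\bar{x}, \lambda\bar{y}_1 X_1(\bar{y}_1, \bar{x}, \bar{f}), \ldots, \lambda\bar{y}_m X_m(\bar{y}_m, \bar{x}, \bar{f})).$$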
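PROOF. Take $m = 1$ for simplicity and consider $\Phi(\bar{x}, g)$, $X(\bar{y}, \bar{x}, \bar{f})$. There
are functionals $\Phi_0(\bar{u}, \bar{y}, \bar{x}, h, \bar{f})$, $\Phi_1(\bar{v}, \bar{x}, h', g)$ in $\mathcal{F}$ such that for some $\bar{n}$, $\bar{m}$,

$$X(\bar{y}, \bar{x}, \bar{f}) = \Phi_0^\infty(\bar{n}, \bar{y}, \bar{x}, \bar{f}), \qquad \Phi(\bar{x}, g) = \Phi_1^\infty(\bar{m}, \bar{x}, g).$$

Consider then the simultaneous induction

$$\Psi_0^\xi(\bar{u}, \bar{y}, \bar{x}, \bar{f}) = \Phi_0(\bar{u}, \bar{y}, \bar{x}, \lambda\bar{u}'\bar{y}'\bar{x}'\,\Psi_0^{<\xi}(\bar{u}', \bar{y}', \bar{x}', \bar{f}), \bar{f}),$$
$$\Psi_1^\xi(\bar{v}, \bar{x}, \bar{f}) = \Phi_1(\bar{v}, \bar{x}, \lambda\bar{v}'\bar{x}'\,\Psi_1^{<\xi}(\bar{v}', \bar{x}', \bar{f}), \lambda\bar{y}'\,\Psi_0^{<\xi}(\bar{n}, \bar{y}', \bar{x}, \bar{f})).$$

By the Simultaneous Induction Lemma $\Psi_1^\infty(\bar{v}, \bar{x}, \bar{f})$ is $\mathcal{F}$-recursive. We
claim now that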
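$$(*) \qquad \Psi(\bar{x}, \bar{f}) = \Psi_1^\infty(\bar{m}, \bar{x}, \bar{f}),$$

which completes the proof. To see that (*) holds, check first that $\Psi_0^\xi = \Phi_0^\xi$,
for all $\xi$. From this an easy induction on $\xi$, using the monotonicity of the
functionals in their function arguments, shows that (for each $\bar{f}$)

$$\lambda\bar{v}\bar{x}\,\Psi_1^\xi(\bar{v}, \bar{x}, \bar{f}) \subseteq \lambda\bar{v}\bar{x}\,\Phi_1^\xi(\bar{v}, \bar{x}, \lambda\bar{y} X(\bar{y}, \bar{x}, \bar{f})),$$

so that

$$\lambda\bar{x}\,\Psi_1^\infty(\bar{m}, \bar{x}, \bar{f}) \subseteq \lambda\bar{x}\,\Psi(\bar{x}, \bar{f}).$$

For the other direction we prove by induction on $\xi$ and using monotonicity
again that

$$\lambda\bar{v}\bar{x}\,\Phi_1^\xi(\bar{v}, \bar{x}, \lambda\bar{y}\,\Phi_0^\infty(\bar{n}, \bar{y}, \bar{x}, \bar{f})) \subseteq \lambda\bar{v}\bar{x}\,\Psi_1^\infty(\bar{v}, \bar{x}, \bar{f}),$$

so that

$$\lambda\bar{v}\bar{x}\,\Phi_1^\infty(\bar{v}, \bar{x}, \lambda\bar{y} X(\bar{y}, \bar{x}, \bar{f})) \subseteq \lambda\bar{v}\bar{x}\,\Psi_1^\infty(\bar{v}, \bar{x}, \bar{f}),$$

thus

$$\lambda\bar{x}\,\Psi(\bar{x}, \bar{f}) \subseteq \lambda\bar{x}\,\Psi_1^\infty(\bar{m}, \bar{x}, \bar{f})$$

and we are done. $\square$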


This result takes a particularly simple form when we substitute a partial
function: if $\Phi(\bar{x}, g, \bar{f})$ and $\varphi$ are $\mathcal{F}$-recursive, then so is $\Psi(\bar{x}, \bar{f}) =
\Phi(\bar{x}, \varphi, \bar{f})$.

Using the Functional Substitution Theorem and the trivial fact that every
functional in $\mathcal{F}$ is also $\mathcal{F}$-recursive, we can immediately see that the class of
$\mathcal{F}$-recursive functionals has all the closure properties of $\mathcal{F}$, so in particular
it too is a suitable class. For example, to prove that it is closed under
composition consider the functional

$$\Phi_c(\bar{x}, f_1, f_2) = f_2(f_1(\bar{x}), \bar{x}).$$

Clearly $\Phi_c \in \mathcal{F}$ by 2.3(v), (vii). So if $\Phi(a, \bar{x}, \bar{f})$, $X(\bar{x}, \bar{f})$ are $\mathcal{F}$-recursive, it
follows from the Functional Substitution Theorem that

$$\Psi(\bar{x}, \bar{f}) = \Phi(X(\bar{x}, \bar{f}), \bar{x}, \bar{f}) = \Phi_c(\bar{x}, \lambda\bar{x}' X(\bar{x}', \bar{f}), \lambda a\bar{x}'\,\Phi(a, \bar{x}', \bar{f}))$$

is also $\mathcal{F}$-recursive. Similarly, using 2.4, the class of $\mathcal{F}$-recursive functionals
is closed under minimalization and primitive recursion, at least when $a \dot{-} 1$
is in $\mathcal{F}$.

Our last result in this section is probably the most important closure
property of the class of $\mathcal{F}$-recursive functionals, for suitable $\mathcal{F}$. It allows us
to use $\mathcal{F}$-recursive functionals freely in constructing complicated inductions
and will be used repeatedly in the sequel. It is also called the First
Recursion Theorem for functional induction.


3.3. THE INDUCTION COMPLETENESS THEOREM (MOSCHOVAKIS [1976]). Let $\mathcal{F}$
be a suitable class of functionals on $A$. If $\Phi(\bar{x}, g, \bar{f})$ is an operative
$\mathcal{F}$-recursive functional, then the fixed point $\Phi^\infty(\bar{x}, \bar{f})$ is also $\mathcal{F}$-recursive.


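PROOF. Let $\Phi(\bar{x}, g, \bar{f}) = \Psi^\infty(\bar{n}, \bar{x}, g, \bar{f})$, where $\Psi(\bar{u}, \bar{x}, h, g, \bar{f}) \in \mathcal{F}$ and $\bar{n}$ is a
sequence of constants. Put

$$X(\bar{u}, \bar{x}, h, \bar{f}) = \Psi(\bar{u}, \bar{x}, h, \lambda\bar{x}' h(\bar{n}, \bar{x}'), \bar{f});$$

clearly $X \in \mathcal{F}$. We claim that

$$(**) \qquad \Phi^\infty(\bar{x}, \bar{f}) = X^\infty(\bar{n}, \bar{x}, \bar{f}),$$

which will complete the proof.

To prove (**), show first by an easy transfinite induction on $\xi$ (using
monotonicity) that

$$X^\xi(\bar{u}, \bar{x}, \bar{f}) = w \Rightarrow \Psi^\xi(\bar{u}, \bar{x}, \lambda\bar{x}'\,\Phi^\infty(\bar{x}', \bar{f}), \bar{f}) = w,$$

thus

$$\lambda\bar{x}\,X^\infty(\bar{n}, \bar{x}, \bar{f}) \subseteq \lambda\bar{x}\,\Psi^\infty(\bar{n}, \bar{x}, \lambda\bar{x}'\,\Phi^\infty(\bar{x}', \bar{f}), \bar{f}) = \lambda\bar{x}\,\Phi(\bar{x}, \lambda\bar{x}'\,\Phi^\infty(\bar{x}', \bar{f}), \bar{f}) = \lambda\bar{x}\,\Phi^\infty(\bar{x}, \bar{f}).$$

For the other direction, prove by induction on $\xi$ (using monotonicity
again) that

$$\Psi^\xi(\bar{u}, \bar{x}, \lambda\bar{x}' X^\infty(\bar{n}, \bar{x}', \bar{f}), \bar{f}) = w \Rightarrow X^\infty(\bar{u}, \bar{x}, \bar{f}) = w,$$

so that

$$\lambda\bar{x}\,\Psi^\infty(\bar{n}, \bar{x}, \lambda\bar{x}' X^\infty(\bar{n}, \bar{x}', \bar{f}), \bar{f}) = \lambda\bar{x}\,\Phi(\bar{x}, \lambda\bar{x}' X^\infty(\bar{n}, \bar{x}', \bar{f}), \bar{f}) \subseteq \lambda\bar{x}\,X^\infty(\bar{n}, \bar{x}, \bar{f});$$

by the minimality of $\Phi^\infty$ then (see 2.1)

$$\lambda\bar{x}\,\Phi^\infty(\bar{x}, \bar{f}) \subseteq \lambda\bar{x}\,X^\infty(\bar{n}, \bar{x}, \bar{f})$$

and we are done. $\square$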
As a simple application of this theorem, we can now eliminate the
assumption that $\varphi(a) = a \dot{-} 1$ is in $\mathcal{F}$ from the proof that the class of
$\mathcal{F}$-recursive functionals is closed under primitive recursion (see 2.4).
Indeed the Induction Completeness Theorem tells us that it is sufficient to
show that $a \dot{-} 1$ is $\mathcal{F}$-recursive. But

$$a \dot{-} 1 = \begin{cases} 0 & \text{if } a = 0, \\ \mu i\,[i + 1 = a] & \text{if } a \in \omega,\ a \ne 0, \\ 0 & \text{if } a \notin \omega, \end{cases}$$

so $a \dot{-} 1$ is $\mathcal{F}$-recursive.


4. Relations with ordinary recursion theory 


Before we proceed to develop further the theory of functional induction, 
we will consider in this and the next section some interesting examples and 
we will establish connections with some classical aspects of recursion 
theory. 


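4.1. Let $A$ be an infinite set and suppose that $\omega \subseteq B \subseteq A$. A functional
$\Phi(\bar{x}, \bar{f})$ on $A$ concentrates on $B$ if for each $\bar{x} \in B^n$ and each $\bar{f} =
(f_1 \cdots f_m) \in \mathcal{PF}_{k_1}(A) \times \cdots \times \mathcal{PF}_{k_m}(A)$ we have

$$\Phi(\bar{x}, \bar{f}) = \Phi(\bar{x}, \bar{f} \upharpoonright B),$$

where $\bar{f} \upharpoonright B = (f_1 \upharpoonright B^{k_1}, \ldots, f_m \upharpoonright B^{k_m})$. If $\mathcal{F}$ is a class of functionals on $A$, let
$\mathcal{F} \upharpoonright B$ be the class of all functionals on $B$ which are restrictions to $B$ of
functionals in $\mathcal{F}$ which concentrate on $B$. Here the restriction $\Phi \upharpoonright B$ of a
functional $\Phi(\bar{x}, \bar{f})$ on $A$ is the functional on $B$ defined by

$$\Phi \upharpoonright B(\bar{x}, \bar{f}) = \Phi(\bar{x}, \bar{f})$$

for $\bar{x} \in B^n$, $\bar{f} \in \mathcal{PF}_{k_1}(B) \times \cdots \times \mathcal{PF}_{k_m}(B)$ and we have

$$\mathcal{F} \upharpoonright B = \{\Phi \upharpoonright B : \Phi \in \mathcal{F},\ \Phi \text{ concentrates on } B\}.$$

Notice that if $\mathcal{F}$ is a suitable class of functionals on $A$, then $\mathcal{F} \upharpoonright B$ is a
suitable class of functionals on $B$.

For example, every functional in $\mathcal{F}_0(A)$ = smallest suitable class of
functionals on $A$, concentrates on $B$, so

$$\mathcal{F}_0(A) \upharpoonright B = \mathcal{F}_0(B).$$

More generally, if $\mathcal{F} = \mathcal{F}_0(A)[\bar{\Phi}]$, where each functional in $\bar{\Phi}$ concentrates
on $B$, then $\mathcal{F}_0(A)[\bar{\Phi}] \upharpoonright B = \mathcal{F}_0(B)[\bar{\Phi} \upharpoonright B]$, where if $\bar{\Phi} = (\Phi_1 \cdots \Phi_k)$, then
$\bar{\Phi} \upharpoonright B = (\Phi_1 \upharpoonright B \cdots \Phi_k \upharpoonright B)$. Notice also here that if every functional in $\mathcal{F}$
concentrates on $B$ then the $\mathcal{F} \upharpoonright B$-recursive functionals are just the
restrictions to $B$ of the $\mathcal{F}$-recursive functionals.

Specializing the above to the case $B = \omega$ we see that for any $A$ the
$\mathcal{F}_0(A) \upharpoonright \omega$-recursive functions are just the $\mathcal{F}_0(\omega)$-recursive functions. It is
now easy to identify these as follows: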


4.2. THEOREM. A partial function $f : \omega^n \to \omega$ is $\mathcal{F}_0$-recursive if and only if it
is recursive.


PROOF. That every recursive partial function on $\omega$ is $\mathcal{F}_0$-recursive is
obvious from the closure properties of the class of $\mathcal{F}_0$-recursive partial
functions. For the converse, notice that if $\Phi(\bar{x}, f)$ is an operative functional
in $\mathcal{F}_0$, then the associated operator $\Theta(f) = \lambda\bar{x}\,\Phi(\bar{x}, f)$ is a recursive operator
in the sense of ordinary recursion theory, so its least fixed point $\Phi^\infty(\bar{x})$ is
recursive by the Kleene First Recursion Theorem (see ROGERS [1967]). $\square$


5. Recursion in type 2 objects and quantifiers 


We will give in this section a number of important types of functionals 
relative to which we want to study recursion. 


5.1. Let $A$ be an infinite set with $\omega \subseteq A$. A type 2 object on $A$ is a total
mapping

$$F : {}^{A}\omega \to \omega,$$

where ${}^{A}\omega$ is the set of all total functions from $A$ into $\omega$. Every type 2
object $F$ can be naturally identified with a functional of signature $(0, 1)$ on
$A$, namely

$$F(f) = \begin{cases} F(f) & \text{if } f : A \to \omega \text{ is total}, \\ \text{undefined} & \text{otherwise.} \end{cases}$$

Clearly this is monotone since if $f \subseteq g$ and $F(f) = w$, then $f$ is total so also $g$
is total and thus $f = g$, therefore $F(g) = w$.

Of particular interest is the type 2 object $E = E_A$ which embodies
existential quantification over $A$:

$$E(f) = \begin{cases} 0 & \text{if } \exists x\,(f(x) = 0), \\ 1 & \text{if } \forall x\,(f(x) \ne 0). \end{cases}$$

Another interesting type 2 object is the Tugué object $E_1$. This is defined
relative to a coding of tuples on $A$, i.e. a 1-1 mapping $\langle\ \rangle : \bigcup_n A^n \to A$.
For each total $f$,

$$E_1(f) = \begin{cases} 0 & \text{if } \exists a_0, a_1, \ldots\ \forall n\,(f(\langle a_0 \cdots a_n \rangle) = 0), \\ 1 & \text{if } \forall a_0, a_1, \ldots\ \exists n\,(f(\langle a_0 \cdots a_n \rangle) \ne 0). \end{cases}$$

To illustrate the present ideas and establish a further connection with
classical notions in recursion theory we will consider in some detail
recursion in $E = E_\omega$. We need a definition first.


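5.2. DEFINITION. Let $\mathcal{F}$ be a suitable class of functionals on $A$. A relation
$R \subseteq A^n$ is $\mathcal{F}$-semirecursive if there is an $\mathcal{F}$-recursive partial function
$f : A^n \to \omega$ such that

$$R = \mathrm{domain}(f),$$

i.e.,

$$R(\bar{x}) \Leftrightarrow f(\bar{x})\downarrow \Leftrightarrow f(\bar{x}) \text{ is defined.}$$

In case $\mathcal{F} = \mathcal{F}_0[\bar{\Phi}]$, we will call the $\mathcal{F}$-semirecursive relations simply
semirecursive in $\bar{\Phi}$. In case $A = \omega$, $\bar{\Phi} = \emptyset$, this terminology agrees with the
classical one in view of 4.2. We now have: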


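5.3. THEOREM (KLEENE [1959]). Let $E$ be the type 2 object which embodies
existential quantification on $\omega$. A relation on $\omega$ is semirecursive in $E$ if and
only if it is $\Pi_1^1$.


PROOF. Let $R(\bar{x})$ be semirecursive in $E$ and let $\Phi(\bar{u}, \bar{x}, f) \in \mathcal{F}_0[E]$ be such
that for some fixed $\bar{n}$,

$$R(\bar{x}) \Leftrightarrow \Phi^\infty(\bar{n}, \bar{x})\downarrow.$$

Then

$$R(\bar{x}) \Leftrightarrow \exists k\,(\Phi^\infty(\bar{n}, \bar{x}) = k) \Leftrightarrow \exists k\,\forall f\,[\forall\bar{u}'\bar{x}'\,(\Phi(\bar{u}', \bar{x}', f) = f(\bar{u}', \bar{x}')) \Rightarrow f(\bar{n}, \bar{x}) = k].$$

Identifying partial functions on $\omega$ with their graphs we can easily calculate
that $R$ is $\Pi_1^1$.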
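For the converse, assume $R(\bar{x})$ is a $\Pi_1^1$ relation, say

$$R(\bar{x}) \Leftrightarrow \forall\alpha\,\exists n\,P(\bar{x}, \bar{\alpha}(n)).$$

Here $\alpha$ varies over ${}^{\omega}\omega$, $\bar{\alpha}(n) = \langle \alpha(0) \cdots \alpha(n - 1) \rangle$ with

$$\langle k_0 \cdots k_{n-1} \rangle = p_0^{k_0+1} \cdots p_{n-1}^{k_{n-1}+1},$$

where $p_i$ = $i$-th prime and $P(\bar{x}, u)$ is a recursive relation with the property
that if $P(\bar{x}, \bar{\alpha}(n))$ & $m \ge n$, then $P(\bar{x}, \bar{\alpha}(m))$. Define

$$\Psi(u, \bar{x}, f) = \begin{cases} 0 & \text{if } P(\bar{x}, u)\ \&\ \mathrm{Seq}(u), \\ 1 \dot{-} E(\lambda t\,(1 \dot{-} f(u * t, \bar{x}))) & \text{if } \neg P(\bar{x}, u) \vee \neg\mathrm{Seq}(u), \end{cases}$$

where $\mathrm{Seq}(u) \Leftrightarrow \exists k_0 \cdots k_{n-1}\,(u = \langle k_0 \cdots k_{n-1} \rangle)$ and

$$u * t = \begin{cases} \langle k_0 \cdots k_{n-1}, t \rangle & \text{if } u = \langle k_0 \cdots k_{n-1} \rangle \text{ for some } k_0 \cdots k_{n-1}, \\ 0 & \text{if } \neg\mathrm{Seq}(u). \end{cases}$$

By convention $1 = \langle\ \rangle$ = code of the empty sequence, so that $\mathrm{Seq}(1)$. Since
$\mathrm{Seq}$, $*$ are recursive, it follows that $\Psi$ is recursive in $E$, so by the Induction
Completeness Theorem 3.3, $\Psi^\infty(u, \bar{x})$ is also recursive in $E$. It is easy now to
check by induction on $\xi$ that if $\mathrm{Seq}(u)$ then

$$\Psi^\xi(u, \bar{x}) = 0 \Rightarrow \forall\alpha \supseteq u\ \exists n\,P(\bar{x}, \bar{\alpha}(n)),$$

where $\alpha \supseteq u \Leftrightarrow \exists t\,(\bar{\alpha}(t) = u)$. On the other hand if $\Psi^\infty(u, \bar{x})$ is undefined
or defined and $\ne 0$, then $\neg P(\bar{x}, u)$ holds, so for some $t$, $\Psi^\infty(u * t, \bar{x})$ is
undefined or is $\ne 0$. Repeating this we get an $\alpha \supseteq u$ such that for all $n$,
$\neg P(\bar{x}, \bar{\alpha}(n))$, so for $\mathrm{Seq}(u)$ we have

$$\Psi^\infty(u, \bar{x}) = 0 \Leftrightarrow \forall\alpha \supseteq u\ \exists n\,P(\bar{x}, \bar{\alpha}(n)),$$

therefore $R(\bar{x}) \Leftrightarrow \Psi^\infty(1, \bar{x}) = 0$. Thus $R = \mathrm{domain}(\varphi)$, where

$$\varphi(\bar{x}) = \begin{cases} 0 & \text{if } \Psi^\infty(1, \bar{x}) = 0, \\ \text{undefined} & \text{if } \Psi^\infty(1, \bar{x}) \ne 0, \end{cases}$$

and $R$ is semirecursive in $E$. $\square$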
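The shape of the induction defining $\Psi^\infty$ can be seen in a small finite analogue. This is a sketch under strong assumptions that depart from the theorem: the quantifier $E$ ranges over a finite set `BRANCH` instead of $\omega$, so it becomes computable, sequences are Python tuples rather than prime-power codes, and the parameters $\bar{x}$ are dropped. The names `E`, `monus1` and `psi_inf` are ours. The recursion terminates exactly when every path through the tree eventually satisfies $P$, and then the value at the root is 0.

```python
BRANCH = range(2)   # assumption: a finite stand-in for the domain of quantification


def E(g):
    """E(g) = 0 if g(t) = 0 for some t in BRANCH, and 1 otherwise."""
    return 0 if any(g(t) == 0 for t in BRANCH) else 1


def monus1(a):
    """1 -. a: equals 1 when a = 0 and 0 otherwise."""
    return 1 if a == 0 else 0


def psi_inf(u, P):
    """0 if P(u); otherwise 1 -. E(lambda t. 1 -. psi_inf(u * t))."""
    if P(u):
        return 0
    return monus1(E(lambda t: monus1(psi_inf(u + (t,), P))))


# Two "bars": every path of length 3 (or every path containing a 1, or of
# length 4) satisfies P, so the search from the empty sequence succeeds.
def bar_len3(u):
    return len(u) >= 3


def bar_mixed(u):
    return 1 in u or len(u) >= 4
```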


We shall prove in Section 9 that a function $\varphi$ is recursive in $E$ exactly
when its graph is $\Pi_1^1$.


5.4. The second type of functionals we are going to consider here 
originates in the notion of a quantifier. 

A quantifier on a set $A$ is a collection $Q$ of subsets of $A$ with the
monotonicity property

$$X \subseteq Y \subseteq A\ \&\ X \in Q \Rightarrow Y \in Q.$$

To avoid trivialities we assume also that $\emptyset \subsetneq Q \subsetneq$ power set of $A$. It is
customary when dealing with quantifiers to write interchangeably

$$\{x : R(x)\} \in Q \Leftrightarrow Q(\{x : R(x)\}) \Leftrightarrow Qx\,R(x).$$

The dual quantifier $\breve{Q}$ to $Q$ is defined by

$$\breve{Q}x\,R(x) \Leftrightarrow \neg Qx\,\neg R(x),$$

i.e.,

$$X \in \breve{Q} \Leftrightarrow (A - X) \notin Q.$$

The standard examples of quantifiers are of course the existential
quantifier $\exists = \exists_A = \{X \subseteq A : X \ne \emptyset\}$ and its dual, the universal quantifier
$\forall = \forall_A = \{A\}$. Another interesting quantifier is the Suslin quantifier
(relative to a coding of tuples on $A$),

$$\mathcal{S}x\,R(x) \Leftrightarrow \forall x_0 \forall x_1 \cdots \exists k\,R(\langle x_0 \cdots x_k \rangle).$$

Every quantifier $Q$ can be essentially identified with the type 2 object $F_Q$
defined as follows: for total $f : A \to \omega$

$$F_Q(f) = \begin{cases} 0 & \text{if } Qx\,(f(x) = 0), \\ 1 & \text{if } \breve{Q}x\,(f(x) \ne 0). \end{cases}$$

Under this identification $E$ corresponds to $\exists$ and $E_1$ to $\breve{\mathcal{S}}$. Of course there
are many type 2 objects which do not correspond to quantifiers.

Every quantifier $Q$ also gives rise to another functional of signature $(0, 1)$,
$F_Q^{\#}$,

$$F_Q^{\#}(f) = \begin{cases} 0 & \text{if } Qx\,(f(x) = 0), \\ 1 & \text{if } \breve{Q}x\,(f(x) \ne 0), \\ \text{undefined} & \text{otherwise.} \end{cases}$$

For example (recalling that $F_\exists = E$),

$$E^{\#}(f) = \begin{cases} 0 & \text{if } \exists x\,(f(x) = 0), \\ 1 & \text{if } \forall x\,(f(x) \ne 0), \\ \text{undefined} & \text{otherwise.} \end{cases}$$

Clearly $F_Q$ is the restriction of $F_Q^{\#}$ to total $f : A \to \omega$ and it is not hard to
see that $F_Q$ is recursive in $F_Q^{\#}$, $E^{\#}$. Indeed,

$$F_Q(f) = F_Q^{\#}(f) \cdot \Psi(f),$$

where

$$\Psi(f) = \begin{cases} 1 & \text{if } f \text{ is total}, \\ \text{undefined} & \text{otherwise.} \end{cases}$$

Now $\Psi$ is recursive in $E^{\#}$ since $\Psi(f) = E^{\#}(\lambda x\,(f(x) + 1))$. In particular $E$ is
recursive in $E^{\#}$ so that every function recursive in $E$ is also recursive in $E^{\#}$.
In general, for most $A$, there are functions recursive in $E^{\#}$ which are not
recursive in $E$. An exception is $A = \omega$, in which case the functions recursive
in $E$ and $E^{\#}$ both coincide with the functions with $\Pi_1^1$ graph; see 5.3 and
9.5.
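The difference between $F_Q$ and $F_Q^{\#}$ lies only in how partial arguments are treated, and this can be seen on a toy example. The sketch below assumes a four-element stand-in for $A$ (the text requires $A$ infinite), represents a quantifier as a predicate on subsets and a partial function as a dictionary, and uses `None` for "undefined"; the names `F_sharp`, `dual`, `exists` and `most` are ours.

```python
A = range(4)   # assumption: a tiny finite stand-in for the set A


def dual(Q):
    """The dual quantifier: X is in the dual iff A - X is not in Q."""
    return lambda X: not Q(frozenset(A) - X)


def F_sharp(Q, f):
    """F_Q^#(f) for a partial f given as a dict; None means undefined."""
    zeros = frozenset(x for x in A if f.get(x) == 0)
    nonzeros = frozenset(x for x in A if f.get(x) not in (None, 0))
    if Q(zeros):
        return 0
    if dual(Q)(nonzeros):
        return 1
    return None


def exists(X):
    """The existential quantifier: all nonempty subsets of A."""
    return len(X) > 0


def most(X):
    """A monotone quantifier chosen for illustration: at least 3 of the 4 points."""
    return len(X) >= 3
```

With `exists`, the functional answers 0 as soon as a zero of $f$ is known, even if $f$ is undefined elsewhere, but it answers 1 only for total $f$ with no zeros.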

For those familiar with the theory of inductive definability we mention
the fact that if $\mathfrak{A} = (A, R_1 \cdots R_k)$ is an acceptable structure, then a relation
on $A$ is semirecursive in $F_Q^{\#}$, $E^{\#}$ and the characteristic functions of $=$,
$R_1 \cdots R_k$ if and only if it is absolutely $\mathcal{L}^{\mathfrak{A}}(Q)$-inductive. For the definitions,
see MOSCHOVAKIS [1974a].

Recursion relative to $F_Q$ and $F_Q^{\#}$ for quantifiers $Q$ on $\omega$ was introduced
by HINMAN [1969], who also established there its main properties (including
the version of 7.2 in the context of recursion in $F_Q^{\#}$). See also ACZEL [1970].


6. The Stage Comparison Theorem 


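6.1. Let $\Phi(\bar{x}, f)$ be an operative functional on $A$. Consider $\Phi^\infty(\bar{x})$ and
associate with each $\bar{x}$ for which $\Phi^\infty(\bar{x})$ is defined the ordinal

$$|\bar{x}|_\Phi = \text{least } \xi \text{ such that } \Phi^\xi(\bar{x})\downarrow.$$

We also agree to put

$$|\bar{x}|_\Phi = \infty = \mathrm{card}(A)^+,$$

when $\Phi^\infty(\bar{x})\uparrow$, where

$$f(\bar{x})\uparrow \Leftrightarrow f(\bar{x}) \text{ is not defined.}$$

Thus,

$$|\bar{x}|_\Phi < \infty \Leftrightarrow \Phi^\infty(\bar{x})\downarrow.$$

Given now two operative functionals $\Phi(\bar{x}, f)$, $\Psi(\bar{y}, g)$ let

$$\bar{x} \le^*_{\Phi,\Psi} \bar{y} \Leftrightarrow \Phi^\infty(\bar{x})\downarrow\ \&\ |\bar{x}|_\Phi \le |\bar{y}|_\Psi,$$
$$\bar{x} <^*_{\Phi,\Psi} \bar{y} \Leftrightarrow \Phi^\infty(\bar{x})\downarrow\ \&\ |\bar{x}|_\Phi < |\bar{y}|_\Psi \Leftrightarrow |\bar{x}|_\Phi < |\bar{y}|_\Psi,$$

$$\chi(\bar{x}, \bar{y}) = \begin{cases} 0 & \text{if } \bar{x} \le^*_{\Phi,\Psi} \bar{y}, \\ 1 & \text{if } \bar{y} <^*_{\Psi,\Phi} \bar{x}; \end{cases}$$

we call $\chi$ the stage comparison partial function for $\Phi$, $\Psi$.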
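For a concrete feeling for stages and for $\chi$, here is a small finite sketch. It assumes finite carriers, finite stages, and partial functions represented as dictionaries, none of which is part of the text; the names `stages`, `chi` and `chain` are ours, and `None` stands for "undefined". Note that $\chi(x, y)$ is defined as soon as at least one of the two inductions reaches its argument.

```python
INF = float("inf")   # plays the role of the ordinal "infinity" of the text


def stages(theta):
    """Map each argument to the least stage at which the iteration of theta defines it."""
    f, stage, xi = {}, {}, 0
    while True:
        g = theta(f)                  # g is the stage-xi approximation
        for x in g:
            stage.setdefault(x, xi)   # record only the first stage of appearance
        if g == f:
            return stage
        f, xi = g, xi + 1


def chi(x, y, st_phi, st_psi):
    """Stage comparison: 0 if |x| <= |y| with |x| finite, 1 if |y| < |x|."""
    sx, sy = st_phi.get(x, INF), st_psi.get(y, INF)
    if sx == INF and sy == INF:
        return None                   # undefined when neither induction reaches its argument
    return 0 if sx <= sy else 1


def chain(step, bound=8):
    """Operator defining n once n - step is defined, so that n enters at stage n // step."""
    def theta(f):
        g = {n: 0 for n in range(min(step, bound))}
        for n in range(step, bound):
            if n - step in f:
                g[n] = 0
        return g
    return theta


st1 = stages(chain(1))   # stage of n is n
st2 = stages(chain(2))   # stage of n is n // 2
```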

If $\Phi$, $\Psi$ are in some suitable class $\mathcal{F}$, it is natural to ask whether the
associated $\chi$ is $\mathcal{F}$-recursive. This fails in general, but holds when $\Phi$ and $\Psi$
are normal according to the following definition due to MOSCHOVAKIS
[1976].
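6.2. DEFINITION. Let $\mathcal{F}$ be a class of functionals on $A$ and let $\Phi(\bar{x}, \bar{f})$ be a
functional in $\mathcal{F}$. We call $\Phi$ normal in $\mathcal{F}$ if there is an $\mathcal{F}$-recursive functional
$\Delta_\Phi(\bar{x}, \bar{f}, \bar{\delta})$ taking values 0, 1 only such that

(i) if $\Phi(\bar{x}, \bar{f} \upharpoonright \{\bar{y} : \bar{\delta}(\bar{y}) = 0\})\downarrow$, then $\Delta_\Phi(\bar{x}, \bar{f}, \bar{\delta}) = 0$,

(ii) if $\bar{\delta}$ is total, $\{\bar{y} : \bar{\delta}(\bar{y}) = 0\} \subseteq \mathrm{domain}(\bar{f})$, and $\Phi(\bar{x}, \bar{f} \upharpoonright \{\bar{y} : \bar{\delta}(\bar{y}) = 0\})\uparrow$,
then $\Delta_\Phi(\bar{x}, \bar{f}, \bar{\delta}) = 1$.

Here the vector notation is used in the obvious way: if $\bar{f} =
(f_1 \cdots f_m) \in \mathcal{PF}_{k_1} \times \cdots \times \mathcal{PF}_{k_m}$, then $\bar{\delta} = (\delta_1 \cdots \delta_m)$ varies also over
$\mathcal{PF}_{k_1} \times \cdots \times \mathcal{PF}_{k_m}$,

$$\bar{f} \upharpoonright \{\bar{y} : \bar{\delta}(\bar{y}) = 0\} = (f_1 \upharpoonright \{\bar{y}_1 : \delta_1(\bar{y}_1) = 0\}, \ldots, f_m \upharpoonright \{\bar{y}_m : \delta_m(\bar{y}_m) = 0\})$$

and $\{\bar{y} : \bar{\delta}(\bar{y}) = 0\} \subseteq \mathrm{domain}(\bar{f})$ means that for all $1 \le i \le m$, $\{\bar{y}_i : \delta_i(\bar{y}_i) =
0\} \subseteq \mathrm{domain}(f_i)$.

We call a class of functionals $\mathcal{F}$ normal if every functional in $\mathcal{F}$ is normal
in $\mathcal{F}$.

The following result is due to MOSCHOVAKIS [1976] and its proof is
patterned after a proof of Aczel-Kunen of a similar result in positive
elementary inductive definability; see MOSCHOVAKIS [1974a].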


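6.3. THE STAGE COMPARISON THEOREM. Let $\mathcal{F}$ be a suitable class of functionals
on $A$ and let $\Phi(\bar{x}, f)$, $\Psi(\bar{y}, g)$ be two functionals in $\mathcal{F}$ which are
normal in $\mathcal{F}$; then the stage comparison function of $\Phi$, $\Psi$ is $\mathcal{F}$-recursive.


PROOF. Let $\Delta_\Phi(\bar{x}, f, \delta)$, $\Delta_\Psi(\bar{y}, g, \varepsilon)$ be the functionals associated with $\Phi$, $\Psi$
in the definition of normality 6.2. We will find a functional $X(\bar{x}, \bar{y}, h)$ with
values 0, 1 whose least fixed point $X^\infty(\bar{x}, \bar{y})$ has the property that if $\bar{x} \le^*_{\Phi,\Psi} \bar{y}$,
then $X^\infty(\bar{x}, \bar{y}) = 0$ and if $\bar{y} <^*_{\Psi,\Phi} \bar{x}$, then $X^\infty(\bar{x}, \bar{y}) = 1$. From this we can easily
get the stage comparison function,

$$\chi(\bar{x}, \bar{y}) = \begin{cases} \Phi^\infty(\bar{x}) \cdot 0 & \text{if } X^\infty(\bar{x}, \bar{y}) = 0, \\ c(\Psi^\infty(\bar{y})) & \text{if } X^\infty(\bar{x}, \bar{y}) = 1, \end{cases}$$

where $c(a) = 1$ for all $a$.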
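Heuristically, $X$ is found as follows: dropping subscripts to simplify the
notation,

$$\bar{x} \le^* \bar{y} \Leftrightarrow \Phi^{|\bar{y}|}(\bar{x})\downarrow \Leftrightarrow \Phi(\bar{x}, \Phi^{<|\bar{y}|})\downarrow \Leftrightarrow \Phi(\bar{x}, \Phi^\infty \upharpoonright \{\bar{x}' : |\bar{x}'| < |\bar{y}|\})\downarrow.$$

Now

$$|\bar{x}'| < |\bar{y}| \Rightarrow \Psi^{|\bar{x}'|}(\bar{y})\uparrow \Leftrightarrow \Psi(\bar{y}, \Psi^\infty \upharpoonright \{\bar{y}' : |\bar{y}'| < |\bar{x}'|\})\uparrow.$$

Thus letting

$$\neg h(z) = \begin{cases} 1 & \text{if } h(z) = 0, \\ 0 & \text{if } h(z) = 1, \end{cases}$$

we have

$$\chi(\bar{x}, \bar{y}) = 0 \Rightarrow \Phi(\bar{x}, \Phi^\infty \upharpoonright \{\bar{x}' : \Psi(\bar{y}, \Psi^\infty \upharpoonright \{\bar{y}' : |\bar{y}'| < |\bar{x}'|\})\uparrow\})\downarrow$$
$$\Rightarrow \Delta_\Phi(\bar{x}, \Phi^\infty, \lambda\bar{x}'\,\neg\Delta_\Psi(\bar{y}, \Psi^\infty, \lambda\bar{y}'\,\neg\chi(\bar{x}', \bar{y}'))) = 0.$$

Similarly

$$\chi(\bar{x}, \bar{y}) = 1 \Rightarrow \Delta_\Phi(\bar{x}, \Phi^\infty, \lambda\bar{x}'\,\neg\Delta_\Psi(\bar{y}, \Psi^\infty, \lambda\bar{y}'\,\neg\chi(\bar{x}', \bar{y}'))) = 1.$$

So if $\Phi^\infty(\bar{x})\downarrow$ or $\Psi^\infty(\bar{y})\downarrow$ we have

$$\chi(\bar{x}, \bar{y}) = \Delta_\Phi(\bar{x}, \Phi^\infty, \lambda\bar{x}'\,\neg\Delta_\Psi(\bar{y}, \Psi^\infty, \lambda\bar{y}'\,\neg\chi(\bar{x}', \bar{y}'))).$$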
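This suggests taking

$$X(\bar{x}, \bar{y}, h) = \Delta_\Phi(\bar{x}, \Phi^\infty, \lambda\bar{x}'\,\neg\Delta_\Psi(\bar{y}, \Psi^\infty, \lambda\bar{y}'\,\neg h(\bar{x}', \bar{y}'))).$$

Clearly $X$ is an $\mathcal{F}$-recursive functional by the Substitution Theorem 3.2, so
by the Completeness Theorem 3.3, $X^\infty(\bar{x}, \bar{y})$ is also $\mathcal{F}$-recursive. To prove
that $X^\infty$ has the required properties one proves first by induction on $\xi$ that

$$\bar{x} \le^*_{\Phi,\Psi} \bar{y}\ \&\ |\bar{x}|_\Phi = \xi \Rightarrow X^\xi(\bar{x}, \bar{y}) = 0$$

and then again by induction on $\xi$ that

$$\bar{y} <^*_{\Psi,\Phi} \bar{x}\ \&\ |\bar{y}|_\Psi = \xi \Rightarrow X^\xi(\bar{x}, \bar{y}) = 1.$$

We leave the details to the reader. $\square$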
We will see in a moment that the examples of functionals we discussed in 


Section 5 lead to normal classes, but let us first establish the following 
simple criterion for normality. 


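6.4. PROPOSITION. Let $\bar{\Phi}$ be a finite sequence of functionals on $A$; if every
functional in the list $\bar{\Phi}$ is normal in $\mathcal{F}_0[\bar{\Phi}]$, then $\mathcal{F}_0[\bar{\Phi}]$ is normal.


PROOF. Assume that every functional in $\bar{\Phi}$ is normal in $\mathcal{F}_0[\bar{\Phi}]$. We prove
that every functional in $\mathcal{F}_0[\bar{\Phi}]$ is normal in $\mathcal{F}_0[\bar{\Phi}]$ by induction on the
construction of $\mathcal{F}_0[\bar{\Phi}]$.

Cases 2.3(i)-(v), i.e., the initial functionals, are trivial. For example if
$\Phi(\bar{x}, f) = f(\bar{x})$ we take

$$\Delta_\Phi(\bar{x}, f, \delta) = \begin{cases} 0 & \text{if } \delta(\bar{x}) = 0, \\ 1 & \text{if } \delta(\bar{x}) \ne 0. \end{cases}$$

Also 2.3(vi), (viii), (ix), (x) are straightforward. Finally, for 2.3(vii),
assuming that for $\Phi(a, \bar{x}, \bar{f})$, $X(\bar{x}, \bar{f})$ we have already found $\Delta_\Phi$, $\Delta_X$ with
the required properties and letting $\Psi(\bar{x}, \bar{f}) = \Phi(X(\bar{x}, \bar{f}), \bar{x}, \bar{f})$, we take

$$\Delta_\Psi(\bar{x}, \bar{f}, \bar{\delta}) = \begin{cases} \Delta_\Phi(X(\bar{x}, \bar{f}), \bar{x}, \bar{f}, \bar{\delta}) & \text{if } \Delta_X(\bar{x}, \bar{f}, \bar{\delta}) = 0, \\ \Delta_X(\bar{x}, \bar{f}, \bar{\delta}) & \text{if } \Delta_X(\bar{x}, \bar{f}, \bar{\delta}) \ne 0. \end{cases} \qquad \square$$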


Call a finite sequence of functionals $\bar{\Phi}$ on $A$ normal if each functional in
$\bar{\Phi}$ is normal in $\mathcal{F}_0[\bar{\Phi}]$.


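6.5. THEOREM. (i) For every finite sequence $\mathbf{F}$ of type 2 objects on $A$ the
extended sequence $\mathbf{F}$, $E$ is normal.
(ii) Let $Q$ be a quantifier on $A$; then $F_Q^{\#}$ is normal.


PROOF. (i) To each type 2 object $F$ in $\mathbf{F}$ assign the same

$$\Delta_F(f, \delta) = 1 \dot{-} E(\lambda x\,(1 \dot{-} \delta(x))),$$

where

$$1 \dot{-} a = \begin{cases} 1 & \text{if } a = 0, \\ 0 & \text{if } a \ne 0 \text{ or } a \notin \omega. \end{cases}$$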


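(ii) Let $Q$ be a quantifier on $A$ and let $\Phi = F_Q^{\#}$ be the associated
functional. Put first

$$\Psi(x, f, \delta) = \begin{cases} 1 & \text{if } \delta(x) \ne 0, \\ 1 & \text{if } \delta(x) = 0,\ f(x) \ne 0, \\ 0 & \text{if } \delta(x) = 0,\ f(x) = 0, \end{cases}$$

and

$$X(x, f, \delta) = \begin{cases} 0 & \text{if } \delta(x) \ne 0, \\ 1 & \text{if } \delta(x) = 0,\ f(x) \ne 0, \\ 0 & \text{if } \delta(x) = 0,\ f(x) = 0. \end{cases}$$

Now let

$$\Delta_\Phi(f, \delta) = \begin{cases} 0 & \text{if } F_Q^{\#}(\lambda x\,\Psi(x, f, \delta)) = 0, \\ 0 & \text{if } F_Q^{\#}(\lambda x\,\Psi(x, f, \delta)) = 1\ \&\ F_Q^{\#}(\lambda x\,X(x, f, \delta)) = 1, \\ 1 & \text{if } F_Q^{\#}(\lambda x\,\Psi(x, f, \delta)) = 1\ \&\ F_Q^{\#}(\lambda x\,X(x, f, \delta)) = 0. \end{cases}$$

To prove that $\Delta_\Phi$ works assume first that $F_Q^{\#}(f \upharpoonright \{x : \delta(x) = 0\})\downarrow$. Then
either $Qx\,(f(x) = 0\ \&\ \delta(x) = 0)$ or $\breve{Q}x\,(f(x) \ne 0\ \&\ \delta(x) = 0)$. In the first
case $F_Q^{\#}(\lambda x\,\Psi(x, f, \delta)) = 0$ and in the second $F_Q^{\#}(\lambda x\,\Psi(x, f, \delta)) = 1$ and
$F_Q^{\#}(\lambda x\,X(x, f, \delta)) = 1$, so that in either case $\Delta_\Phi(f, \delta) = 0$. If now $\delta$ is total,
$\{x : \delta(x) = 0\} \subseteq \mathrm{domain}(f)$ and $F_Q^{\#}(f \upharpoonright \{x : \delta(x) = 0\})\uparrow$ then we must have
that both $Qx\,(f(x) = 0\ \&\ \delta(x) = 0)$ and $\breve{Q}x\,(f(x) \ne 0\ \&\ \delta(x) = 0)$ fail, so
that $\breve{Q}x\,(f(x) \ne 0 \vee f(x)\uparrow \vee \delta(x) \ne 0)$ and $Qx\,(f(x) = 0 \vee f(x)\uparrow \vee \delta(x) \ne 0)$
hold. Noticing that $f(x)\uparrow \Rightarrow \delta(x) \ne 0$ we have $\breve{Q}x\,(f(x) \ne 0 \vee \delta(x) \ne 0)$ and
$Qx\,(f(x) = 0 \vee \delta(x) \ne 0)$, so that $F_Q^{\#}(\lambda x\,\Psi(x, f, \delta)) = 1$ and $F_Q^{\#}(\lambda x\,X(x, f, \delta))
= 0$, i.e. $\Delta_\Phi(f, \delta) = 1$. $\square$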


7. Semirecursive relations 


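7.1. Let $\mathcal{F}$ be a class of functionals on $A$ and recall from 5.2 that a relation
$R \subseteq A^n$ is called $\mathcal{F}$-semirecursive if it is the domain of an $\mathcal{F}$-recursive
partial function $\varphi : A^n \to \omega$. We call a relation $R \subseteq A^n$ $\mathcal{F}$-recursive if its
characteristic function

$$\chi_R(\bar{x}) = \begin{cases} 0 & \text{if } R(\bar{x}), \\ 1 & \text{if } \neg R(\bar{x}), \end{cases}$$

is $\mathcal{F}$-recursive, where $\neg R = A^n - R$.

Here we discuss some structure and closure properties of the class of
$\mathcal{F}$-semirecursive relations for suitable, normal $\mathcal{F}$. This study will be
completed in Section 9 after we prove enumeration.

Given a relation $R \subseteq A^n$, a norm on $R$ is a map $\sigma : R \to$ ordinals. We
also agree to put $\sigma(\bar{x}) = \infty$ = an ordinal bigger than all $\sigma(\bar{y})$ with $R(\bar{y})$, if
$\neg R(\bar{x})$ holds. To each norm $\sigma$ on $R$ we associate the relations

$$\bar{x} \le^*_\sigma \bar{y} \Leftrightarrow R(\bar{x})\ \&\ \sigma(\bar{x}) \le \sigma(\bar{y}),$$
$$\bar{x} <^*_\sigma \bar{y} \Leftrightarrow R(\bar{x})\ \&\ \sigma(\bar{x}) < \sigma(\bar{y}) \Leftrightarrow \sigma(\bar{x}) < \sigma(\bar{y}).$$

If $\Gamma$ is a collection of relations on $A$ and $\sigma$ is a norm on $R$ we call $\sigma$ a
$\Gamma$-norm if both $\le^*_\sigma$, $<^*_\sigma$ are in $\Gamma$. We say that $\Gamma$ itself is normed or has the
prewellordering property if every relation in $\Gamma$ admits a $\Gamma$-norm.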


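7.2. THE PREWELLORDERING THEOREM. Let $\mathcal{F}$ be a suitable, normal class of
functionals on $A$. Then the class of $\mathcal{F}$-semirecursive relations is normed.


PROOF. Let $R(\bar{x}) \Leftrightarrow \varphi(\bar{x})\downarrow$, where $\varphi$ is $\mathcal{F}$-recursive, let $\varphi(\bar{x}) = \Phi^\infty(\bar{n}, \bar{x})$,
with $\Phi \in \mathcal{F}$ and put $\sigma(\bar{x}) = |\bar{n}, \bar{x}|_\Phi$ (see 6.1). Then $\sigma$ is an $\mathcal{F}$-semirecursive
norm on $R$ by the Stage Comparison Theorem 6.3. $\square$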
A class of relations $\Gamma$ on $A$ is closed under $\vee$ if for all $R, S \subseteq A^n$ in $\Gamma$,
the relation

$$(R \vee S)(\bar{x}) \Leftrightarrow R(\bar{x}) \vee S(\bar{x})$$

is also in $\Gamma$. Similarly for $\&$ and $\neg$.


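7.3. THEOREM. Let $\mathcal{F}$ be a suitable, normal class of functionals on $A$. Then
(i) The class of $\mathcal{F}$-semirecursive relations is closed under $\&$, $\vee$.
(ii) A relation $R$ is $\mathcal{F}$-recursive if and only if both $R$ and $\neg R$ are
$\mathcal{F}$-semirecursive.


PROOF. (i) Closure under $\&$ is easy and does not need normality. This is
because

$$\varphi(\bar{x})\downarrow\ \&\ \psi(\bar{x})\downarrow \Leftrightarrow (\varphi(\bar{x}) + \psi(\bar{x}))\downarrow.$$

For $\vee$, let

$$R(\bar{x}) \Leftrightarrow \varphi(\bar{x})\downarrow, \qquad S(\bar{x}) \Leftrightarrow \psi(\bar{x})\downarrow$$

with $\varphi$, $\psi$ $\mathcal{F}$-recursive. By the Stage Comparison Theorem we can get a
function $\chi(\bar{x}_1, \bar{x}_2)$, where $\bar{x}_1, \bar{x}_2 \in A^n$, which is defined exactly when either
$\varphi(\bar{x}_1)\downarrow$ or $\psi(\bar{x}_2)\downarrow$. Then

$$R(\bar{x}) \vee S(\bar{x}) \Leftrightarrow \chi(\bar{x}, \bar{x})\downarrow$$

and we are done.

(ii) Assume $R$, $\neg R$ are $\mathcal{F}$-semirecursive. Let $R(\bar{x}) \Leftrightarrow \varphi(\bar{x})\downarrow$,
$\neg R(\bar{x}) \Leftrightarrow \psi(\bar{x})\downarrow$ and find again $\chi(\bar{x}_1, \bar{x}_2)$ $\mathcal{F}$-recursive such that if $\varphi(\bar{x}_1)\downarrow$
and $\psi(\bar{x}_2)\uparrow$, then $\chi(\bar{x}_1, \bar{x}_2) = 0$ and if $\psi(\bar{x}_2)\downarrow$ and $\varphi(\bar{x}_1)\uparrow$ then $\chi(\bar{x}_1, \bar{x}_2) =
1$. Then we must have $\chi_R(\bar{x}) = \chi(\bar{x}, \bar{x})$, so $R$ is $\mathcal{F}$-recursive. $\square$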


For example in view of 6.5 and 5.3 we have that a relation on $\omega$ is
recursive in $E$ if and only if it is $\Delta_1^1$, i.e. hyperarithmetic (KLEENE [1959]).

8. The Enumeration Theorem 

Our purpose in this section is to prove the existence of universal
$\mathcal{F}$-recursive partial functions for appropriate $\mathcal{F}$. We recall the definition
first.


8.1. Let $F$ be a class of partial functions (of several arguments) on $A$, such
that $\omega \subseteq A$. We say that $F$ has the enumeration property or is
$\omega$-parametrized if for each $n \ge 1$ there is some $\varphi(e, x_1 \cdots x_n)$ in $F$ such that
the class of all $n$-ary partial functions in $F$ coincides with $\{\varphi_e : e \in \omega\}$,
where for any partial function $f$,

$$f_e(x_1 \cdots x_n) = f(e, x_1 \cdots x_n).$$

If $f = \varphi_e$, we call $e$ a code or index of $f$ (relative to $\varphi$). A partial function $\varphi$
as above is called universal (for the $n$-ary partial functions in $F$).

We will prove in 8.4 that the class of $\mathcal{F}$-recursive partial functions has the
enumeration property provided that $\mathcal{F}$ is suitable, finitely generated, and
admits a coding of tuples. We now explain these notions.


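8.2. Given a set $A$ such that $\omega \subseteq A$, a coding for tuples on $A$ is a
one-to-one function $\mathcal{C} = \langle\ \rangle$ from the set of all finite sequences from $A$
(including the empty sequence) into $A$. With each such coding we associate
the following total functions (from $A$ into $\omega$) and (total) mappings from $A^2$
or $A$ into $A$.

(i) $$\chi_{\mathrm{Seq}}(x) = \chi^{\mathcal{C}}_{\mathrm{Seq}}(x) = \begin{cases} 0 & \text{if } \exists x_0 \cdots \exists x_{n-1}\,(x = \langle x_0 \cdots x_{n-1} \rangle), \\ 1 & \text{otherwise.} \end{cases}$$

(ii) $$\mathrm{lh}(x) = \mathrm{lh}^{\mathcal{C}}(x) = \begin{cases} n & \text{if } \exists x_0 \cdots \exists x_{n-1}\,(x = \langle x_0 \cdots x_{n-1} \rangle), \\ 0 & \text{otherwise.} \end{cases}$$

(iii) $$(x)_i = (x)^{\mathcal{C}}_i = \begin{cases} x_i & \text{if } \exists x_0 \cdots \exists x_{n-1}\,(x = \langle x_0 \cdots x_{n-1} \rangle), \\ 0 & \text{otherwise.} \end{cases}$$

(iv) $$x * y = x *^{\mathcal{C}} y = \begin{cases} \langle x_0 \cdots x_{n-1}\, y_0 \cdots y_{m-1} \rangle & \text{if } \exists x_0 \cdots x_{n-1}\,(x = \langle x_0 \cdots x_{n-1} \rangle)\ \&\ \exists y_0 \cdots y_{m-1}\,(y = \langle y_0 \cdots y_{m-1} \rangle), \\ 0 & \text{otherwise.} \end{cases}$$

(v) $$s(x) = s^{\mathcal{C}}(x) = \langle x \rangle.$$

Given now a class of functionals $\mathcal{F}$ on $A$ we say that $\mathcal{F}$ admits the coding
of tuples $\mathcal{C}$ if $\chi_{\mathrm{Seq}}$, $\mathrm{lh}^{\mathcal{C}}$ are $\mathcal{F}$-recursive and the class of $\mathcal{F}$-recursive
functionals is closed under substitution by $(x)^{\mathcal{C}}_i$, $x *^{\mathcal{C}} y$, $s^{\mathcal{C}}$. This means that
if $\Phi(x, \bar{y}, \bar{f})$ is $\mathcal{F}$-recursive and $t$ is any of these mappings then $\Psi(\bar{z}, \bar{y}, \bar{f}) =
\Phi(t(\bar{z}), \bar{y}, \bar{f})$ is also $\mathcal{F}$-recursive.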

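Note now that if a suitable $\mathcal{F}$ admits the coding $\mathcal{C}$, then we can assume without loss of generality that $\mathcal{C} = \langle\ \rangle$ agrees with the standard coding
\[ \langle k_0, \ldots, k_{n-1}\rangle = p_0^{k_0+1} \cdots p_{n-1}^{k_{n-1}+1} \]
on $\omega$. This is because if we define $\mathcal{C}' = \langle\ \rangle'$ by
\[ \langle x_0, \ldots, x_{n-1}\rangle' = \begin{cases} r(\langle x_0, \ldots, x_{n-1}\rangle) & \text{if at least one of } x_0, \ldots, x_{n-1} \text{ is not in } \omega \text{ and } \langle x_0, \ldots, x_{n-1}\rangle \in \omega, \\ \langle x_0, \ldots, x_{n-1}\rangle & \text{if at least one of } x_0, \ldots, x_{n-1} \text{ is not in } \omega \text{ and } \langle x_0, \ldots, x_{n-1}\rangle \notin \omega, \\ p_0^{x_0+1} \cdots p_{n-1}^{x_{n-1}+1} & \text{if all } x_i \text{ are in } \omega, \end{cases} \]
where $r : \omega \to \omega$ is a recursive 1-1 function whose range is recursive and avoids $\{p_0^{k_0+1} \cdots p_{n-1}^{k_{n-1}+1} : n, k_0, \ldots, k_{n-1} \in \omega\}$, then $\mathcal{F}$ admits also $\mathcal{C}'$. The only nontrivial thing to check here is that the relation
\[ P(x) \iff \chi^{\mathcal{C}}_{\mathrm{Seq}}(x) = 0 \ \&\ \forall i < \mathrm{lh}^{\mathcal{C}}(x)\,((x)^{\mathcal{C}}_i \in \omega) \]
is $\mathcal{F}$-recursive. For that, notice that
\[ \neg P(x) \iff \chi^{\mathcal{C}}_{\mathrm{Seq}}(x) = 1 \ \vee\ \mu i\,[\chi_{\omega}((x)^{\mathcal{C}}_i) = 1] < \mathrm{lh}^{\mathcal{C}}(x). \]

From now on we will agree that any coding agrees with the standard one on $\omega$, so that notations such as $\langle e_0, \ldots, e_{n-1}\rangle$, $(e)_i$ etc. are unambiguous, if $e, e_i \in \omega$. We also agree that
\[ \langle \emptyset \rangle = \text{code of empty sequence} = 1. \]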
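
The standard coding on $\omega$ and the associated functions of 8.2 can be sketched concretely. The following is a minimal illustration restricted to natural numbers; the function names (`encode`, `decode`, `chi_seq`, `lh`, `proj`, `concat`) are hypothetical, and returning 0 from `proj` when the index is out of range is an assumption made here for totality.

```python
def primes(n):
    """Return the first n primes p_0, ..., p_{n-1} by trial division."""
    found = []
    candidate = 2
    while len(found) < n:
        if all(candidate % p for p in found):
            found.append(candidate)
        candidate += 1
    return found


def encode(ks):
    """Standard code <k_0,...,k_{n-1}> = p_0^(k_0+1) * ... * p_{n-1}^(k_{n-1}+1)."""
    code = 1  # the empty sequence has code 1
    for p, k in zip(primes(len(ks)), ks):
        code *= p ** (k + 1)
    return code


def decode(x):
    """Return the tuple coded by x, or None if x is not a sequence code."""
    if x < 1:
        return None
    ks = []
    while x > 1:
        p = primes(len(ks) + 1)[-1]  # next prime in order
        exponent = 0
        while x % p == 0:
            x //= p
            exponent += 1
        if exponent == 0:
            return None  # a prime was skipped, so x is not a code
        ks.append(exponent - 1)
    return tuple(ks)


def chi_seq(x):
    """Characteristic function of Seq: 0 on codes, 1 elsewhere."""
    return 0 if decode(x) is not None else 1


def lh(x):
    """Length of the coded sequence, 0 if x is not a code."""
    t = decode(x)
    return len(t) if t is not None else 0


def proj(x, i):
    """(x)_i for codes; 0 otherwise (and, by assumption, 0 if i is out of range)."""
    t = decode(x)
    return t[i] if t is not None and 0 <= i < len(t) else 0


def concat(x, y):
    """x * y: code of the concatenation, 0 unless both arguments are codes."""
    tx, ty = decode(x), decode(y)
    return encode(tx + ty) if tx is not None and ty is not None else 0
```

For instance, `encode((2, 0, 1))` is $2^3 \cdot 3^1 \cdot 5^2 = 600$, and `encode(())` is 1, in agreement with the convention for the empty sequence.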


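Note now that if $\mathcal{F}$ admits the coding $\mathcal{C}$, then for each fixed $n \ge 1$ the class of $\mathcal{F}$-recursive functionals is closed under the mapping $p_n(x_0, \ldots, x_{n-1}) = \langle x_0, \ldots, x_{n-1}\rangle$. Also the class of $\mathcal{F}$-recursive functionals is closed under substitutions by the mapping $t(x, i, j)$ which is given by
\[ t(x, i, j) = \langle (x)_i, \ldots, (x)_j \rangle \quad \text{if } 0 \le i \le j \in \omega, \]
and takes a fixed default value otherwise.

For that, one proves first by primitive recursion on $j$ that if $\Phi(x, \bar{y}, \bar{f})$ is $\mathcal{F}$-recursive then so is
\[ \Psi(x, i, j, z, \bar{y}, \bar{f}) = \Phi(t(x, i, j) * z, \bar{y}, \bar{f}), \]
and then puts $z = 1 = \langle\ \rangle$.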
We conclude the preliminaries to the proof of the Enumeration 
Theorem with the following definition. 


8.3. Let $\mathcal{F}$ be a suitable class of functionals. We say that $\mathcal{F}$ is finitely generated if there is a finite sequence of functionals $\bar{\Phi}$ and a finite sequence of total mappings $\bar{t} = (t_1, \ldots, t_k)$ where $t_i : A^{n_i} \to A$, such that $\mathcal{F}$ is the smallest suitable class of functionals containing $\bar{\Phi}$ and closed under substitutions by mappings in $\bar{t}$.


8.4. THE ENUMERATION THEOREM. Let $\mathcal{F}$ be a suitable class of functionals on
$A$. If $\mathcal{F}$ is finitely generated and admits a coding of tuples, then the class of
$\mathcal{F}$-recursive partial functions has the enumeration property.


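PROOF. It will be clearly enough to find an $\mathcal{F}$-recursive universal partial function $\varphi(e, x)$ for the unary $\mathcal{F}$-recursive functions. Our plan is the following: We associate in some canonical way a code (Gödel number) $\ulcorner\Phi\urcorner \in \omega$ to each functional $\Phi$ in $\mathcal{F}$; then we construct an $\mathcal{F}$-recursive functional $U(e, x, f)$ with $f$ unary such that if $\ulcorner\Phi(x_1, \ldots, x_n, f_1, \ldots, f_m)\urcorner = e$ then
\[ \Phi(x_1, \ldots, x_n, f_1, \ldots, f_m) = U(e, \langle x_1, \ldots, x_n\rangle, \langle f_1, \ldots, f_m\rangle), \]
where $\langle f_1, \ldots, f_m\rangle = \lambda x\,\langle f_1'(x), \ldots, f_m'(x)\rangle$, with $f'(x) = f((x)_0, \ldots, (x)_{k-1})$, if $f : A^k \to \omega$. We arrange things so that for some recursive $p : \omega \to \omega$, $p(\ulcorner\Phi(x_1, \ldots, x_n, \bar{f})\urcorner) = n$. Then letting (for $g : A^2 \to \omega$)
\[ V(e, x, g) = U(e, x, \lambda x\,(g(e, \langle (x)_0, \ldots, (x)_{p(e)-1}\rangle))), \]
we have for any operative $\Phi(x_1, \ldots, x_n, f)$ with $\ulcorner\Phi\urcorner = e$ and $p(e) = n$
\begin{align*}
\Phi(x_1, \ldots, x_n, \lambda x_1 \cdots x_n\, g(e, \langle x_1, \ldots, x_n\rangle))
 &= U(e, \langle x_1, \ldots, x_n\rangle, \langle \lambda x_1 \cdots x_n\, g(e, \langle x_1, \ldots, x_n\rangle)\rangle) \\
 &= U(e, \langle x_1, \ldots, x_n\rangle, \lambda x\,(g(e, \langle (x)_0, \ldots, (x)_{n-1}\rangle))) \\
 &= V(e, \langle x_1, \ldots, x_n\rangle, g),
\end{align*}
so that we can prove by induction on $\xi$ that
\[ V^{\xi}(e, \langle x_1, \ldots, x_n\rangle) = \Phi^{\xi}(x_1, \ldots, x_n), \]
therefore
\[ V^{\infty}(e, \langle x_1, \ldots, x_n\rangle) = \Phi^{\infty}(x_1, \ldots, x_n). \]
Then
\[ \varphi(e, x) = V^{\infty}((e)_0, \langle (e)_1, \ldots, (e)_{\mathrm{lh}(e)-1}, x\rangle) \]
is the required universal function.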

The way to define $\ulcorner\Phi\urcorner$ is more or less obvious by induction on the construction of $\mathcal{F}$. Then we define a functional $U^{*}(e, x, g, f)$, by considering cases on $e = \ulcorner\Phi\urcorner$, so that $(U^{*})^{\infty}(e, x, f) = U(e, x, f)$ has the required properties. It will help to notice here that since $\mathcal{F}$ is finitely generated there are finitely many functionals $\Phi_1, \ldots, \Phi_k$ in $\mathcal{F}$ and total mappings $t_1, \ldots, t_l$ such that $\mathcal{F}$ is the smallest class of functionals containing $\Phi_1, \ldots, \Phi_k$ which is closed under substitutions by $t_1, \ldots, t_l$ and satisfies (i)-(ix) of 2.3 and (x) for $\Phi_1, \ldots, \Phi_k$ only. The actual details are straightforward but a bit messy and we omit them here. □


9. Consequences of enumeration 


There are several corollaries of the enumeration theorem which we 
consider first. At the end of this section we will also look at some 
consequences of the enumeration theorem for the structure theory of 
semirecursive relations. 


9.1. Let $F$ be a class of partial functions on $A$. A universal system for $F$ is a sequence $\{\varphi^n\}_{n \ge 1}$ of partial functions in $F$ such that for each $n$, $\varphi^n$ is $n+1$-ary and universal for the $n$-ary partial functions in $F$. We call $\{\varphi^n\}$ good if for each $n, m \ge 1$ there is a total recursive function $S^m_n(e, k_1, \ldots, k_m)$ on $\omega$ such that for each $k_1, \ldots, k_m \in \omega$, $x_1, \ldots, x_n \in A$
\[ \varphi^{m+n}(e, k_1, \ldots, k_m, x_1, \ldots, x_n) = \varphi^{n}(S^m_n(e, k_1, \ldots, k_m), x_1, \ldots, x_n). \]
We call these the s-m-n functions of $\{\varphi^n\}$. We now have:

9.2. THE s-m-n THEOREM. Let $\mathcal{F}$ be a suitable class of functionals on $A$ which is finitely generated and admits a coding of tuples. Then the class of $\mathcal{F}$-recursive partial functions admits a good universal system.


PROOF. By 8.4, let $\varphi(e, x, y)$ be an $\mathcal{F}$-recursive partial function which is universal for the binary $\mathcal{F}$-recursive partial functions. Then let
\[ \varphi^{n}(e, x_1, \ldots, x_n) = \varphi((e)_0, (e)_1, \langle x_1, \ldots, x_n\rangle), \]
where $\mathcal{F}$ admits the coding of tuples $\langle\ \rangle$. Let
\[ f(x, y) = \varphi^{m+n}((x)_0, \ldots, (x)_m, (y)_0, \ldots, (y)_{n-1}) \]
and find $e_0$ such that
\[ f(x, y) = \varphi(e_0, x, y). \]
Then
\[ S^m_n(e, k_1, \ldots, k_m) = \langle e_0, \langle e, k_1, \ldots, k_m\rangle\rangle \]
is the required s-m-n function. □
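
The identity required of $S^m_n$ in 9.1 can be checked by unwinding the definitions used in the proof. The following chain is a sketch of that routine verification: the first step uses the definition of $\varphi^n$, the second the choice of $e_0$, and the third the definition of $f$.

```latex
\begin{align*}
\varphi^{n}(S^m_n(e, k_1, \ldots, k_m), x_1, \ldots, x_n)
  &= \varphi(e_0, \langle e, k_1, \ldots, k_m\rangle, \langle x_1, \ldots, x_n\rangle) \\
  &= f(\langle e, k_1, \ldots, k_m\rangle, \langle x_1, \ldots, x_n\rangle) \\
  &= \varphi^{m+n}(e, k_1, \ldots, k_m, x_1, \ldots, x_n).
\end{align*}
```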
A basic corollary of the s-m-n theorem is:


9.3. THE (SECOND) RECURSION THEOREM. Let $\mathcal{F}$ be a suitable, finitely generated class of functionals on $A$, which admits a coding of tuples. If $\{\varphi^n\}$ is a good universal system for the class of $\mathcal{F}$-recursive partial functions, then for each $\mathcal{F}$-recursive partial function $\psi(a, \bar{x})$ there is some $e \in \omega$ such that
\[ \varphi_e(\bar{x}) = \psi(e, \bar{x}). \]


PROOF. Let $e_0$ be such that $\varphi_{e_0}(a, \bar{x}) = \psi(S^1_n(a, a), \bar{x})$ and take $e = S^1_n(e_0, e_0)$. □
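
The diagonal construction in this proof can be tried out in a toy setting. The sketch below is an illustration under strong assumptions, not the construction of the text: "indices" are Python source strings denoting lambda expressions, `universal` evaluates them, and `s11` plays the role of an s-m-n function that fixes the first argument. All three names, and `fixed_point`, are hypothetical.

```python
def s11(e, a):
    """Toy s-m-n function: an index for x -> phi_e(a, x)."""
    return f"(lambda x: ({e})({a!r}, x))"


# Evaluation environment for indices; s11 must be visible to evaluated code.
ENV = {"s11": s11}


def universal(e, *args):
    """Toy universal function: run the program with index e on args."""
    return eval(e, ENV)(*args)


def fixed_point(psi):
    """Given an index psi of a binary function, return e with phi_e(x) = psi(e, x).

    Follows the proof: e0 is an index of (a, x) -> psi(S(a, a), x),
    and e = S(e0, e0).
    """
    e0 = f"(lambda a, x: ({psi})(s11(a, a), x))"
    return s11(e0, e0)


# A program that returns its own index together with x + 1.
psi = "(lambda e, x: (e, x + 1))"
e = fixed_point(psi)
own_index, value = universal(e, 41)
```

Here `universal(e, x)` unwinds to $\varphi_{e_0}(e_0, x) = \psi(S(e_0, e_0), x) = \psi(e, x)$, which is exactly the computation in the proof.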


Now we can complete the study of the general structure properties of the $\mathcal{F}$-semirecursive relations which we started in Section 7. We say that a class of relations $R$ on $A$ is closed under $\exists^{\omega}$ if for each $P(\bar{x}, a)$ in $R$, the relation
\[ \exists^{\omega}P(\bar{x}) \iff \exists n\,P(\bar{x}, n) \iff \exists n\,(n \in \omega \ \&\ P(\bar{x}, n)) \]
is also in $R$.

9.4. THEOREM. Let $\mathcal{F}$ be a suitable, finitely generated class of functionals on $A$ which is normal and admits a coding of tuples. Then

(i) The class of $\mathcal{F}$-semirecursive relations is closed under $\exists^{\omega}$.

(ii) A partial function $\varphi : A^n \to \omega$ is $\mathcal{F}$-recursive if and only if its graph is $\mathcal{F}$-semirecursive.


Both of these results are immediate consequences of the following basic 
lemma. 


9.5. NUMBER SELECTION LEMMA. If $\mathcal{F}$ is a suitable, finitely generated class of functionals on $A$ which is normal and admits a coding of tuples, then for each $\mathcal{F}$-semirecursive relation $P(\bar{x}, a)$ there is some $\mathcal{F}$-recursive $\psi$ such that
\[ \exists n\,P(\bar{x}, n) \implies \psi(\bar{x})\downarrow \ \&\ P(\bar{x}, \psi(\bar{x})). \]


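PROOF. Let $\varphi(e, \bar{x}, a)$ be a universal $\mathcal{F}$-recursive partial function by 8.4. Let $\sigma$ be an $\mathcal{F}$-semirecursive norm on $W = \{(e, \bar{x}, a) : \varphi(e, \bar{x}, a)\downarrow\}$ and let $\chi(e, \bar{x}, a, e', \bar{x}', a')$ be $\mathcal{F}$-recursive such that
\[ (e, \bar{x}, a) \le^{*}_{\sigma} (e', \bar{x}', a') \implies \chi(e, \bar{x}, a, e', \bar{x}', a') = 0, \]
\[ (e', \bar{x}', a') <^{*}_{\sigma} (e, \bar{x}, a) \implies \chi(e, \bar{x}, a, e', \bar{x}', a') = 1 \]
by 6.3. Then using the Recursion Theorem 9.3 we can find an $e \in \omega$ such that if
\[ P(\bar{x}, a) \iff \varphi(e_0, \bar{x}, a)\downarrow, \]
then
\[ \varphi_e(\bar{x}, a) = \begin{cases} 0 & \text{if } a \in \omega \ \&\ \chi(e_0, \bar{x}, a, e, \bar{x}, a+1) = 0, \\ \varphi_e(\bar{x}, a+1) + 1 & \text{if } a \in \omega \ \&\ \chi(e_0, \bar{x}, a, e, \bar{x}, a+1) = 1, \\ 0 & \text{if } a \notin \omega. \end{cases} \]
Finally, put
\[ \psi(\bar{x}) = \varphi_e(\bar{x}, 0). \]

To prove that $\psi$ works notice first that if $\varphi(e_0, \bar{x}, a)\downarrow$, then $\varphi_e(\bar{x}, a)\downarrow$ or $\varphi_e(\bar{x}, a+1)\downarrow$. Moreover if $\varphi_e(\bar{x}, b)\downarrow$, then $\varphi_e(\bar{x}, c)\downarrow$ for all $c \le b$. Thus if $\exists a\,\varphi(e_0, \bar{x}, a)\downarrow$, clearly $\varphi_e(\bar{x}, 0)\downarrow$. Let $a$ be least such that $\varphi_e(\bar{x}, a)$ is defined by the first clause above; this exists since otherwise $\varphi_e(\bar{x}, 0) = \varphi_e(\bar{x}, 1) + 1 = \varphi_e(\bar{x}, 2) + 2 = \cdots$. Since for any $1 \le b \le a$, $\varphi_e(\bar{x}, b-1) = \varphi_e(\bar{x}, b) + 1$, we have $\varphi_e(\bar{x}, 0) = \varphi_e(\bar{x}, a) + a = a$ and $\varphi(e_0, \bar{x}, a)\downarrow$, which completes the proof. □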
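
In the classical case of ordinary recursion theory on $\omega$, where the stage of a computation can be taken to be its number of steps, the same selection is obtained by the familiar dovetailing search. The sketch below illustrates only that classical analogue, not the norm-based construction of the proof; `select` and `is_square_root` are hypothetical names, and a semi-decision procedure is modelled as a Python generator that finishes exactly when the relation holds.

```python
from itertools import count


def select(semi_decide, x):
    """Return some n with P(x, n); run forever if there is none.

    semi_decide(x, n) must return a generator that terminates
    exactly when P(x, n) holds. At stage s the computation for n = s
    is started and every running computation advances by one step.
    """
    running = []
    for n in count():
        running.append((n, semi_decide(x, n)))
        for m, computation in running:
            try:
                next(computation)
            except StopIteration:
                return m  # the computation for m has halted first


def is_square_root(x, n):
    """Semi-decide n * n == x: halts if true, runs forever if false."""
    for _ in range(n + 1):
        yield  # some preliminary work
    while n * n != x:
        yield  # never halts when n is not a square root of x
```

When several witnesses exist, the search returns the one whose computation halts first, which mirrors the role of the stage comparison $\chi$ in the proof.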


As a simple application we see for example that a partial function on $\omega$ is
recursive in $E$ iff its graph is $\Pi^1_1$ (see 5.3). Similarly for $E^{\#}$.

The basic idea of proving selection theorems in order to establish
structure results is due to GANDY [1967].


RECURSION IN HIGHER TYPES 


10. The type structure over $\omega$

10.1. For each integer $j$, define the set $T^{(j)}$ of objects of type $j$ (over $\omega$) by the induction
\[ T^{(0)} = \omega, \]
\[ T^{(j+1)} = \text{the set of all total functions from } T^{(j)} \text{ into } \omega. \]

We will often use superscripted variables $\alpha^j$, $\beta^j$, $\gamma^j$ over $T^{(j)}$ to indicate the type, when it is not clear from the context. The members of
\[ T = \bigcup_j T^{(j)} \]
are the objects of higher type over $\omega$.

A point is a finite sequence $x = (x_1, \ldots, x_n)$ of objects of higher type; the type of $(x_1, \ldots, x_n)$ is the maximum of the types of the $x_i$'s. The concatenation of two points $x = (x_1, \ldots, x_n)$, $y = (y_1, \ldots, y_m)$ is the point
\[ x^\frown y = x, y = (x_1, \ldots, x_n, y_1, \ldots, y_m). \]

We will study recursion in higher types by applying the general theory developed so far to the set
\[ A = T^{*} = \text{set of all points}, \]
relative to some appropriate classes of functionals on $T^{*}$. As usual, we will identify the one element sequence $(\alpha^j)$ with $\alpha^j$ itself so that we have $T \subseteq T^{*}$; in particular $\omega \subseteq T^{*}$.

Moreover, every product of the form
\[ \mathcal{X} = T^{(j_1)} \times \cdots \times T^{(j_n)} \]
is also contained in $T^{*}$, so we can think of partial functions
\[ f : \mathcal{X} \to \omega \]
as partial functions on $T^{*}$ whose domain is contained in $\mathcal{X}$. We are ultimately interested in studying recursive partial functions on these spaces. By definition, if $\mathcal{X} = X_1 \times \cdots \times X_n$ and $\mathcal{Y} = Y_1 \times \cdots \times Y_m$, then
\[ \mathcal{X} \times \mathcal{Y} = X_1 \times \cdots \times X_n \times Y_1 \times \cdots \times Y_m, \]
so the collection of these spaces contained in $T^{*}$ is closed under cartesian products.


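10.2. We will need some simple mappings which embed $T^{(j)}$ into $T^{(k)}$ when $j \le k$ (type increasing functions) and their inverses (type decreasing functions). Put first
\[ u^1(n) = \lambda m\,(n), \qquad d_0(\alpha^1) = \alpha^1(0); \]
clearly,
\[ u^1 : T^{(0)} \to T^{(1)}, \qquad d_0 : T^{(1)} \to T^{(0)}, \]
$u^1$ is one-to-one and for all $n$,
\[ d_0(u^1(n)) = n. \]
Now, by induction, let
\[ u^{j+2}(\alpha^{j+1}) = \lambda\beta^{j+1}\,\alpha^{j+1}(d_j(\beta^{j+1})), \]
\[ d_{j+1}(\alpha^{j+2}) = \lambda\beta^{j}\,\alpha^{j+2}(u^{j+1}(\beta^{j})) \]
and verify easily that $u^{j+2}$ is an injection of $T^{(j+1)}$ into $T^{(j+2)}$ and for all $\alpha^{j+1} \in T^{(j+1)}$,
\[ d_{j+1}(u^{j+2}(\alpha^{j+1})) = \alpha^{j+1}. \]
Mappings
\[ u^{j,k} : T^{(j)} \to T^{(k)} \quad (j < k), \]
\[ d_{k,j} : T^{(k)} \to T^{(j)} \quad (j < k) \]
are defined easily by iterating these, i.e.,
\[ u^{j,j+1} = u^{j+1}, \qquad d_{j+1,j} = d_j, \]
\[ u^{j,j+2} = u^{j+2} \circ u^{j+1}, \qquad d_{j+2,j} = d_j \circ d_{j+1}, \]
etc. It is also convenient to let
\[ u^{j,j} = d_{j,j} = \text{the identity on } T^{(j)}. \]

Now all the $u^{j,k}$ are one-to-one and $d_{k,j}(u^{j,k}(\alpha^j)) = \alpha^j$ ($j \le k$, $\alpha^j \in T^{(j)}$).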
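
The mutual recursion defining the one-step embeddings and their left inverses can be sketched directly, modelling type 0 objects as Python integers and objects of type $j+1$ as Python functions on objects of type $j$. The names `up` and `down` are hypothetical: `up(j)` stands for the embedding of type $j-1$ into type $j$ and `down(j)` for the map from type $j+1$ back to type $j$.

```python
def up(j):
    """Embedding of T^(j-1) into T^(j), for j >= 1."""
    if j == 1:
        # a number n becomes the constant function with value n
        return lambda n: (lambda m: n)
    # an object a of type j-1 becomes b -> a(down(j-2)(b)), b of type j-1
    return lambda a: (lambda b: a(down(j - 2)(b)))


def down(j):
    """Type decreasing map from T^(j+1) to T^(j), for j >= 0."""
    if j == 0:
        # a type 1 object is sent to its value at 0
        return lambda a: a(0)
    # an object a of type j+1 becomes b -> a(up(j)(b)), b of type j-1
    return lambda a: (lambda b: a(up(j)(b)))


# down(j) undoes up(j + 1), checked here pointwise on small examples.
square = lambda n: n * n              # an object of type 1
F = lambda a: a(3) + a(4)             # an object of type 2
same_on_5 = down(1)(up(2)(square))(5) == square(5)
same_on_square = down(2)(up(3)(F))(square) == F(square)
```

Equality of higher type objects cannot be tested in general, so the sketch only compares values at sample arguments.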


11. Kleene classes of functionals on T* 


We now come to the main notions which underlie recursion theory in 
higher types. 


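11.1. DEFINITION. A class $\mathcal{K}$ of functionals on $A = T^{*}$ is a Kleene class if it is suitable, contains the functions (i)-(iii) and is closed under the rules (iv), (v) below. We use variables $x, y, z, \ldots$ for members of $T^{*}$ and $\sigma$, $\tau$ for finite sequences of members of $T^{*}$ (including the empty sequence).

(i) Type specifying function:
\[ \chi(j, x) = \begin{cases} 0 & \text{if } j \in \omega,\ x \in T^{(j)}, \\ 1 & \text{otherwise.} \end{cases} \]

(ii) Length function:
\[ \mathrm{length}(x) = n \quad \text{if } x = (x_1, \ldots, x_n). \]

(iii) Application of type 1 objects:
\[ p(x, n) = \begin{cases} x(n) & \text{if } x \in T^{(1)},\ n \in \omega, \\ 0 & \text{otherwise.} \end{cases} \]

(iv) Application of type $> 1$ objects: If $\Phi(x, y, \sigma, \bar{f})$ is in $\mathcal{K}$, then for each $j$
\[ \Psi_j(y, \sigma, \bar{f}) = \Psi(y, \sigma, \bar{f}) = \begin{cases} y(\lambda\alpha^j\,\Phi(\alpha^j, y, \sigma, \bar{f})) & \text{if } y \in T^{(j+2)}, \\ 0 & \text{if } y \notin T^{(j+2)}, \end{cases} \]
is also in $\mathcal{K}$. It is understood here that if $y \in T^{(j+2)}$, then $\Psi(y, \sigma, \bar{f})$ is undefined unless $\lambda\alpha^j\,\Phi(\alpha^j, y, \sigma, \bar{f})$ is total.

(v) Substitutions by concatenation and point projection: Recall that the concatenation mapping is the total mapping (from $T^{*} \times T^{*}$ into $T^{*}$)
\[ t(x, y) = x^\frown y = x, y. \]
Point projection is the total mapping from $T^{*} \times T^{*}$ into $T^{*}$ given by
\[ p(x, i) = \begin{cases} x_i & \text{if } x = (x_0, \ldots, x_{n-1}),\ i \in \omega \ \&\ i < n, \\ 0 & \text{otherwise.} \end{cases} \]

Using (iii) and (iv) we can prove by induction on $j \ge 1$ that the application functional which takes the value $x(y)$ if $x \in T^{(j)}$, $y \in T^{(j-1)}$, and the value 0 otherwise, is also in $\mathcal{K}$.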
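Suppose now $\mathcal{K}$ is a Kleene class and
\[ \varphi : \mathcal{X} \to \omega \]
is a partial function on some space $\mathcal{X} = X_1 \times \cdots \times X_n \subseteq T^{*}$. We say that $\varphi$ is $\mathcal{K}$-recursive if it is $\mathcal{K}$-recursive (in the sense of 2.2) when we consider it as a partial function on $T^{*}$ whose domain happens to be included in $\mathcal{X}$; this is easily equivalent to saying that there is a $\mathcal{K}$-recursive $\varphi^{*} : T^{*} \to \omega$ whose restriction to $\mathcal{X}$ is $\varphi$. By the definitions, this means that there is an operative functional $\Phi(\sigma, x, f)$ in $\mathcal{K}$ and integers $n_1, \ldots, n_k$ such that
\[ \varphi(x) = \Phi^{\infty}(n_1, \ldots, n_k, x) \quad (x \in \mathcal{X}). \]
These partial functions are the main objects of our study.

It is also natural to call a partial mapping
\[ \theta : \mathcal{X} \to T^{(j+1)} \]
$\mathcal{K}$-recursive if there is a $\mathcal{K}$-recursive partial function
\[ \varphi : T^{(j)} \times \mathcal{X} \to \omega \]
such that
\[ \theta(x) = \lambda\alpha^j\,\varphi(\alpha^j, x); \]
similarly,
\[ \theta : \mathcal{X} \to \mathcal{Y} = T^{(j_1)} \times \cdots \times T^{(j_n)} \]
is $\mathcal{K}$-recursive if
\[ \theta(x) = (\theta_1(x), \ldots, \theta_n(x)), \]
with $\mathcal{K}$-recursive $\theta_1, \ldots, \theta_n$.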


11.2. Let $\mathcal{K}_0$ be the smallest Kleene class on $T^{*}$. More generally, given a finite list of functionals $\bar{\Phi} = (\Phi_1, \ldots, \Phi_k)$ on $T^{*}$, let $\mathcal{K}_0[\bar{\Phi}]$ be the smallest Kleene class of functionals which contains $\Phi_1, \ldots, \Phi_k$. Instead of $\mathcal{K}_0$-recursive we say Kleene recursive and instead of $\mathcal{K}_0[\bar{\Phi}]$-recursive we say Kleene recursive in $\bar{\Phi}$.

For example the mappings $u^{j,k}$, $d_{k,j}$ given in 10.2 are all Kleene recursive.


12. Kleene recursion relative to objects of higher type and quantifiers 


To illustrate the basic notions, let us consider now some interesting 
examples of higher type recursion. Our discussion here parallels that in 
Sections 4 and 5. 


12.1. Every object
\[ F = {}^{j}F \in T^{(j)} \quad (j \ge 1) \]
of higher type can be viewed as a functional on $T^{*}$, where for $j \ge 2$
\[ F(f) = \begin{cases} F(f \restriction T^{(j-1)}) & \text{if } f \restriction T^{(j-1)} \text{ is total}, \\ \text{undefined} & \text{otherwise}, \end{cases} \]
and for $j = 1$,
\[ F(x) = \begin{cases} F(x) & \text{if } x \in \omega, \\ 0 & \text{if } x \notin \omega. \end{cases} \]

Thus we can study recursion relative to a fixed higher type object $F$, i.e., $\mathcal{K}_0[F]$-recursion. Of particular importance are the objects ${}^{j}E$ ($j \ge 2$) which embody existential quantification over $T^{(j-2)}$,
\[ {}^{j}E(\alpha^{j-1}) = \begin{cases} 0 & \text{if } \exists\alpha^{j-2}\,(\alpha^{j-1}(\alpha^{j-2}) = 0), \\ 1 & \text{if } \forall\alpha^{j-2}\,(\alpha^{j-1}(\alpha^{j-2}) \ne 0). \end{cases} \]

Our first result here is very useful: it shows that Kleene recursion in a higher type object can be defined in terms of (absolute) Kleene recursion by substitution.

12.2. THEOREM (KLEENE [1959]). Let $F = {}^{j}F$ be an object of type $j \ge 1$. A partial function $\varphi : \mathcal{X} \to \omega$ is Kleene recursive in $F$ if and only if there is a Kleene recursive partial function $\psi : T^{(j)} \times \mathcal{X} \to \omega$ such that for all $x \in \mathcal{X}$,
\[ \varphi(x) = \psi(F, x). \]


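PROOF. In one direction it is enough to associate with each $\Phi(\sigma, y, \sigma', g)$ in $\mathcal{K}_0$ a functional $\Phi^{*}(\sigma, \sigma', f, g)$ in $\mathcal{K}_0[F]$ such that
\[ \Phi(\sigma, F, \sigma', g) = \Phi^{*}(\sigma, \sigma', \lambda\tau\tau'\,g(\tau, F, \tau'), g). \]
Because then if $\Phi$ is operative we can easily prove by induction on $\xi$ that $\Phi^{\xi}(\sigma, F, \sigma') = w \implies \Phi^{*\infty}(\sigma, \sigma', \Phi^{\infty}) = w$ and $\Phi^{*\xi}(\sigma, \sigma', \Phi^{\infty}) = w \implies \Phi^{\infty}(\sigma, F, \sigma') = w$, so that $\Phi^{\infty}(\sigma, F, \sigma') = \Phi^{*\infty}(\sigma, \sigma', \Phi^{\infty})$ and hence $\Phi^{\infty}(\sigma, F, \sigma')$ is recursive in $F$. The construction of $\Phi^{*}$ is by induction on the construction of $\mathcal{K}_0$ and it is straightforward. For example, if
\[ \Phi(\sigma, y, \sigma', g) = g(\sigma, y, \sigma') \]
we take
\[ \Phi^{*}(\sigma, \sigma', f, g) = f(\sigma, \sigma') \]
while if
\[ \Phi(\sigma, y, \sigma', g) = g(\sigma, \sigma') \]
(so that $y$ does not appear in $g$) we put
\[ \Phi^{*}(\sigma, \sigma', f, g) = g(\sigma, \sigma'). \]
On the other hand if (as in 11.1(iv))
\[ \Psi(\sigma, y, \sigma', g) = y(\lambda\alpha^{j-1}\,\Phi(\alpha^{j-1}, \sigma, y, \sigma', g)) \]
we let
\[ \Psi^{*}(\sigma, \sigma', f, g) = F(\lambda z\,\Phi^{*}(z, \sigma, \sigma', f, g)). \]

For the converse, we prove that if $\Phi(\sigma, f)$ is a functional in $\mathcal{K}_0[F]$, then there is a functional $\Phi^{*}(y, \sigma, g)$ in $\mathcal{K}_0$ such that
\[ \Phi(\sigma, \lambda\tau\,g(F, \tau)) = \Phi^{*}(F, \sigma, g). \]
Then for operative $\Phi$, we have by induction on $\xi$ that $\Phi^{\xi}(\sigma) = \Phi^{*\xi}(F, \sigma)$, so $\Phi^{\infty}(\sigma) = \Phi^{*\infty}(F, \sigma)$ and we are done. The construction of $\Phi^{*}$ is again by induction on the construction of $\mathcal{K}_0[F]$. The only interesting case is when $\Phi = F$; then we take
\[ \Phi^{*}(y, g) = \begin{cases} y(\lambda\alpha^{j-1}\,g(y, \alpha^{j-1})) & \text{if } y \in T^{(j)}, \\ 0 & \text{if } y \notin T^{(j)}, \end{cases} \]
provided $j \ge 2$. If $j = 1$,
\[ \Phi^{*}(y, x) = \begin{cases} y(x) & \text{if } y \in T^{(1)},\ x \in \omega, \\ 0 & \text{otherwise.} \end{cases} \]
□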


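12.3. Given a quantifier $Q$ on some $T^{(j)}$, we associate with it the object of type $j + 2$
\[ {}^{j+2}F_Q(\alpha^{j+1}) = \begin{cases} 0 & \text{if } Q\alpha^j\,(\alpha^{j+1}(\alpha^j) = 0), \\ 1 & \text{if } \breve{Q}\alpha^j\,(\alpha^{j+1}(\alpha^j) \ne 0), \end{cases} \]
and the functional on $T^{*}$
\[ F^{\#}_Q(f) = \begin{cases} 0 & \text{if } Q\alpha^j\,(f(\alpha^j) = 0), \\ 1 & \text{if } \breve{Q}\alpha^j\,(f(\alpha^j) \ne 0), \\ \text{undefined} & \text{otherwise.} \end{cases} \]

In particular if $Q = \exists^j$ is the existential quantifier on the objects of type $j$, we get ${}^{j+2}E$ and ${}^{j+2}E^{\#}$ respectively. As in 5.4, ${}^{j+2}F_Q$ is easily Kleene recursive in $F^{\#}_Q$, ${}^{j+2}E^{\#}$.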


12.4. We are mainly interested in studying Kleene recursion relative to functionals which concentrate (in the sense of 4.1) on
\[ T^{*}_m = \text{set of all points of type} \le m, \]
for some $m$. These surely include all the examples we have introduced.

Suppose $\bar{\Phi} = (\Phi_1, \ldots, \Phi_k)$ is a finite list of such functionals. Then it is easy to see that all functionals in $\mathcal{K}_0[\bar{\Phi}]$ concentrate on $T^{*}_m$, and moreover
\[ \mathcal{K}_0[\bar{\Phi}] \restriction T^{*}_m = \mathcal{K}_0(T^{*}_m)[\bar{\Phi} \restriction T^{*}_m], \]
where $\mathcal{K}_0(T^{*}_m)[\bar{\Psi}]$ is the smallest suitable class of functionals on $T^{*}_m$ containing all functionals in $\bar{\Psi}$ which also satisfies conditions (i)-(v) of 11.1 (appropriately restricted to $T^{*}_m$).

As in 4.2, we can easily check that if $\mathrm{type}(\mathcal{X}) = 0$, then a partial function $f : \mathcal{X} \to \omega$ is Kleene recursive (or Kleene recursive in some fixed $\alpha \in T^{(1)}$) exactly when it is recursive (or recursive in $\alpha$) in the sense of ordinary recursion theory. Again, as in 5.3, we can verify that if $\mathrm{type}(\mathcal{X}) \le 1$ and $R \subseteq \mathcal{X}$, then $R$ is Kleene semirecursive in ${}^{2}E$ (or ${}^{2}E^{\#}$) exactly when $R$ is $\Pi^1_1$.

13. Normality and enumeration in higher type recursion


When we study recursion in functionals $\bar{\Phi}$ which concentrate on some $T^{*}_m$, we do not expect to have normality on all of $T^{*}$ in the sense of Section 6. We introduce therefore a simple modification of this notion, which will cover our present situation.


13.1. Let $\mathcal{F}$ be a class of functionals on a set $A$, let $\omega \subseteq B \subseteq A$ and let $\Phi$ be a functional. We say that $\Phi$ is normal in $\mathcal{F}$ on $B$ if there is an $\mathcal{F}$-recursive functional $\Delta_{\Phi}$ such that $\Phi \restriction B$, $\Delta_{\Phi} \restriction B$ satisfy conditions (i) and (ii) of 6.2 (with $A$ replaced by $B$ there). We say that $\mathcal{F}$ is normal on $B$ if every $\Phi \in \mathcal{F}$ is normal in $\mathcal{F}$ on $B$. Analogously to Proposition 6.4 we now have


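13.2. PROPOSITION. Let $\bar{\Phi} = (\Phi_1, \ldots, \Phi_k)$ be a finite list of functionals on $T^{*}$ which concentrate on $T^{*}_m$. If each $\Phi_i$ is normal in $\mathcal{K}_0[\bar{\Phi}, {}^{2}E, \ldots, {}^{m}E]$ on $T^{*}_m$, then $\mathcal{K}_0[\bar{\Phi}, {}^{2}E, \ldots, {}^{m}E]$ is normal on $T^{*}_m$. (If $m \le 1$, then we just have $\mathcal{K}_0[\bar{\Phi}]$ above.)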


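PROOF. It is similar to the proof of 6.4. We have to consider the additional cases coming from 11.1 of which the only interesting one is 11.1(iv). So consider $\Phi(x, y, \sigma, \bar{f})$ and assume we have found $\Delta_{\Phi}$ that satisfies 6.2, working on $T^{*}_m$ of course. Let then $j$ be such that $j + 2 \le m$ and put
\[ \Psi(y, \sigma, \bar{f}) = \begin{cases} y(\lambda\alpha^j\,\Phi(\alpha^j, y, \sigma, \bar{f})) & \text{if } y \in T^{(j+2)}, \\ 0 & \text{otherwise.} \end{cases} \]
Since $\Psi(y, \sigma, \bar{f})\downarrow \iff \forall\alpha^j\,(\Phi(\alpha^j, y, \sigma, \bar{f})\downarrow)$ or $y \notin T^{(j+2)}$, we take
\[ \Delta_{\Psi}(y, \sigma, \bar{f}, \bar{\delta}) = \begin{cases} {}^{j+2}E(\lambda\alpha^j\,(1 \mathbin{\dot{-}} \Delta_{\Phi}(\alpha^j, y, \sigma, \bar{f}, \bar{\delta}))) & \text{if } y \in T^{(j+2)}, \\ 0 & \text{if } y \notin T^{(j+2)}. \end{cases} \]
□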
Call a finite sequence of functionals $\bar{\Phi}$ normal on $T^{*}_m$ if each functional in $\bar{\Phi}$ is normal in $\mathcal{K}_0[\bar{\Phi}]$ on $T^{*}_m$. Noticing that each object $F = {}^{j+2}F$ concentrates on $T^{*}_{j+2}$, and similarly for quantifiers on $T^{(j)}$, we have the following analog of 6.5:


13.3. THEOREM. (i) Let $F = {}^{j+2}F$ be an object of type $\ge 2$; then $F, {}^{2}E, \ldots, {}^{j+2}E$ is normal on $T^{*}_{j+2}$.
(ii) Let $Q$ be a quantifier on $T^{(j)}$; then $F^{\#}_Q, {}^{2}E, \ldots, {}^{j+2}E$ is normal on $T^{*}_{j+2}$.

We will consider the consequences of normality for the structure of semirecursive relations especially in the case of recursion relative to objects of higher type in Section 17.

We proceed now to prove the enumeration property in the present context. The main fact is the following


13.4. PROPOSITION. Let $\mathcal{K}$ be a Kleene class of functionals on $T^{*}$. Then $\mathcal{K}$ admits a coding of tuples on $T^{*}$.


PROOF. Put
Fe Pe ae a el ee ek Te Oe 


where as usual
\[ y, x_1, \ldots, x_n = \text{concatenation of } y, x_1, \ldots, x_n = y^\frown x_1^\frown \cdots ^\frown x_n. \]
It is routine to check that $\mathcal{K}$ admits this coding. □


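It follows from this proposition and 8.4 that for every finite list of functionals $\bar{\Phi}$ on $T^{*}$, the class of partial functions on $T^{*}$ which are Kleene recursive in $\bar{\Phi}$ has the enumeration property. From this in turn and the results of Section 9 we obtain immediately the following theorems.


13.5. ENUMERATION AND s-m-n THEOREM. For each finite list $\bar{\Phi} = (\Phi_1, \ldots, \Phi_k)$ of functionals on $T^{*}$ and each space $\mathcal{X}$, we can define a partial function
\[ \varphi^{\mathcal{X}} : \omega \times \mathcal{X} \to \omega, \]
which is Kleene recursive in $\bar{\Phi}$ such that the following conditions hold:

(i) A partial function $\psi : \mathcal{X} \to \omega$ is Kleene recursive in $\bar{\Phi}$ exactly when for some $e \in \omega$ and all $x \in \mathcal{X}$,
\[ \psi(x) = \varphi^{\mathcal{X}}_e(x) = \varphi^{\mathcal{X}}(e, x) \]
(i.e. each $\varphi^{\mathcal{X}}$ is universal for Kleene recursion in $\bar{\Phi}$, on $\mathcal{X}$).

(ii) For each space $\mathcal{X}$ of type 0 and each $\mathcal{Y}$ there is a total recursive function
\[ S^{\mathcal{X}}_{\mathcal{Y}} : \omega \times \mathcal{X} \to \omega \]
such that
\[ \varphi^{\mathcal{X} \times \mathcal{Y}}(e, x_1, \ldots, x_m, y_1, \ldots, y_n) = \varphi^{\mathcal{Y}}(S^{\mathcal{X}}_{\mathcal{Y}}(e, x_1, \ldots, x_m), y_1, \ldots, y_n). \]

We will call a system of universal partial functions $\{\varphi^{\mathcal{X}}\}$ with these properties good.

13.6. THE (SECOND) RECURSION THEOREM. If $\{\varphi^{\mathcal{X}}\}$ is a good universal system, then for each partial function $\psi : \omega \times \mathcal{X} \to \omega$ which is Kleene recursive in $\bar{\Phi}$, there is some $e \in \omega$ such that
\[ \varphi^{\mathcal{X}}_e(x) = \varphi^{\mathcal{X}}(e, x) = \psi(e, x). \]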


14. The original definition of Kleene 


We give in this section the original definition of recursion in higher types 
given in KLEENE [1959]. In the next section we will establish the equival- 
ence of this original definition with ours. 


14.1. In our terminology, Kleene defines a specific operative functional $\Theta(e, x, f)$, where $e$ varies over $\omega$, $x$ varies over $T^{*}$ and $f$ varies over partial functions on $\omega \times T^{*}$. Then he calls a partial function $\varphi : \mathcal{X} \to \omega$ recursive if for some fixed $e \in \omega$ (an index for $\varphi$) and all $x \in \mathcal{X}$,
\[ \varphi(x) = \Theta^{\infty}(e, x). \]

Thus, in effect, he gives a single master recursion and takes as recursive all partial functions (on spaces) reducible to it, instead of specifying the form of inductions which are allowed.

The fixed point of $\Theta$ is indicated by
\[ \{e\}(x) = \Theta^{\infty}(e, x). \]

It is easier then to describe $\Theta(e, x, f)$ if we write $\{e\}(x)$ for $\Theta(e, x, f)$ in the left of the equations below and again $\{e\}(x)$ for $f(e, x)$ in the right of the same equations. Since Kleene identifies any two points which contain the same objects in the same order within each fixed type (like $(n, \alpha, m, \alpha^2, \beta)$ and $(n, \alpha, \beta, \alpha^2, m)$), we will assume in (i)-(ix) below that all points are in simplified form, i.e. the objects of type 0 precede those of type 1 etc. Thus an expression like $(\alpha^j, x)$ where $x$ is a simplified point abbreviates the simplified point whose first type $j$ object is $\alpha^j$ and which agrees with $x$ otherwise. Then we let $\{e\}(x) = \{e\}(x^{*})$, where $x^{*}$ is the unique point in simplified form equivalent to $x$ (e.g. $(n, \alpha, m, \alpha^2, \beta)^{*} = (n, m, \alpha, \beta, \alpha^2)$). The definition is by cases on the form of $e$ as a sequence code and the arity of $x$, where
\[ \mathrm{arity}(x) = \langle n_0, n_1, \ldots, n_r\rangle, \]
with $r = \mathrm{type}(x)$ and for $0 \le i \le r$, $n_i$ = number of type $i$ objects appearing in $x$:

(i) $\{e\}(n, x) = n + 1$ if $e = \langle 1, \mathrm{arity}(n, x)\rangle$,

(ii) $\{e\}(x) = q$ if $e = \langle 2, \mathrm{arity}(x), q\rangle$,

(iii) $\{e\}(n, x) = n$ if $e = \langle 3, \mathrm{arity}(n, x)\rangle$,

(iv) $\{e\}(x) = \{e_2\}(\{e_1\}(x), x)$ if $e = \langle 4, \mathrm{arity}(x), e_1, e_2\rangle$,

(v) $\{e\}(k, l, m, n, x) = \begin{cases} k & \text{if } m = n, \\ l & \text{if } m \ne n, \end{cases}$ if $e = \langle 5, \mathrm{arity}(k, l, m, n, x)\rangle$,

(vi) $\{e\}(x) = \{e_1\}(x_1)$ if $e = \langle 6, \mathrm{arity}(x), j, k, e_1\rangle$, where $x_1$ contains at least $k + 1$ objects of type $j$ and $x$ comes from $x_1$ by moving the $(k + 1)$-st type $j$ object in $x_1$ to the front of the list,

(vii) $\{e\}(n, \alpha, x) = \alpha(n)$ if $e = \langle 7, \mathrm{arity}(n, \alpha, x)\rangle$,

(viii) $\{e\}(\alpha^j, x) = \alpha^j(\lambda\alpha^{j-2}\,\{e_1\}(\alpha^j, \alpha^{j-2}, x))$ if $e = \langle 8, \mathrm{arity}(\alpha^j, x), j, e_1\rangle$ and $j \ge 2$,

(ix) $\{e\}(n, x, y) = \{n\}(x)$ if $e = \langle 9, \mathrm{arity}(n, x, y), \mathrm{arity}(x)\rangle$.


To each $e, x$ such that $\{e\}(x) = \Theta^{\infty}(e, x)$ is defined we associate as usual the ordinal
\[ |e, x| = \text{least } \xi \text{ such that } \Theta^{\xi}(e, x) \text{ is defined.} \]


14.2. The reader familiar with KLEENE [1959] has certainly noticed that the definition given in 14.1 is not exactly the one given there. The Kleene scheme S5 for introducing primitive recursion has been replaced by an initial function (v). Correspondingly the Kleene indexing is different from the one given in 14.1. It is however well known (and essentially mentioned in KLEENE [1959]) that both ways of presenting the definition are basically the same in the following strong sense: if $\{e\}'(x)$ refers to the KLEENE [1959] variant, then there are total recursive functions $p, q : \omega \to \omega$ such that $\{e\}'(x) = \{p(e)\}(x)$ and $\{e\}(x) = \{q(e)\}'(x)$. The proof is fairly simple for anyone familiar with KLEENE [1959] and we will omit it. Here we gave a simplified version of the Kleene definition because it is easier to use as a technical tool, which we intend to do in the next section.


14.3. Kleene visualized his inductive definition as leading to a computation procedure for $\{e\}(x)$ which we can explain by an example following KLEENE [1959] (see Fig. 1).

[Fig. 1. Computation tree for $\{e\}(F)$: the root $\{e\}(F)$ branches to $\{e_1\}(F)$, which in turn depends on $\{d\}(F, n)$ for all $n$, and to $\{e_2\}(\{e_1\}(F), F)$, which in turn depends on $\{k\}(F)$.]


Let $e = \langle 4, \langle 0, 0, 1\rangle, e_1, e_2\rangle$ and suppose $F = {}^{2}F$ is a fixed type 2 object. To compute $\{e\}(F)$ we have to compute $\{e_1\}(F)$ and then, if this is defined, $\{e_2\}(\{e_1\}(F), F)$. Say $e_1 = \langle 8, \langle 0, 0, 1\rangle, 2, d\rangle$. Then to compute $\{e_1\}(F)$ we have to compute $\{d\}(F, n)$ for all $n \in \omega$. Take for example $d = \langle 2, \langle 1, 0, 1\rangle, 3\rangle$. Then for all $n$, $\{d\}(F, n) = 3$. Tracing back the above steps we have that
\[ \{e_1\}(F) = F(\lambda n\,\{d\}(F, n)) = F(\lambda n\,(3)) = k. \]

So we have to compute now $\{e_2\}(k, F)$. Say $e_2 = \langle 9, \langle 1, 0, 1\rangle, \langle 0, 0, 1\rangle\rangle$. Then to compute $\{e_2\}(k, F)$ we must compute $\{k\}(F)$. If this is undefined, so is $\{e\}(F)$. On the other hand if for example $k = \langle 2, \langle 0, 0, 1\rangle, 5\rangle$, then $\{k\}(F) = 5$ and tracing these steps back we also have $\{e\}(F) = 5$.
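
The computation traced in this example can be replayed by a small interpreter. The sketch below is a toy under explicit assumptions: it covers only schemes (ii), (iv), (viii) and (ix) of 14.1, and it keeps indices as nested Python tuples instead of sequence numbers, since the prime-power codes of nested indices are far too large to compute with. Consequently tuples are treated as type 0 objects here. The names `Obj`, `run`, `arity`, `simplify` and `Undefined` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Obj:
    """A higher type object: its type (>= 1) and a Python function for it."""
    level: int
    fn: Callable


class Undefined(Exception):
    """Raised when no scheme applies, so the computation has no value."""


def type_of(x):
    return x.level if isinstance(x, Obj) else 0  # ints and index tuples: type 0


def simplify(point):
    """Simplified form: lower types first, order kept within each type."""
    return sorted(point, key=type_of)


def arity(point):
    levels = [type_of(x) for x in point]
    return tuple(levels.count(i) for i in range(max(levels, default=0) + 1))


def run(e, point):
    """Evaluate {e}(point) for the supported schemes."""
    x = simplify(point)
    if not isinstance(e, tuple) or len(e) < 2 or e[1] != arity(x):
        raise Undefined(e)
    if e[0] == 2:                      # (ii) constant q
        return e[2]
    if e[0] == 4:                      # (iv) {e2}({e1}(x), x)
        _, _, e1, e2 = e
        return run(e2, [run(e1, x)] + x)
    if e[0] == 8:                      # (viii) apply the first type j object
        _, _, j, e1 = e
        a = next(o for o in x if type_of(o) == j)
        rest = [o for o in x if o is not a]
        inner = lambda b: run(e1, [a, b] + rest)
        return a.fn(Obj(j - 1, inner))
    if e[0] == 9:                      # (ix) {n}(x'), x' of the stated arity
        n, rest, wanted = x[0], x[1:], list(e[2])
        sub = []
        for o in rest:
            t = type_of(o)
            if t < len(wanted) and wanted[t] > 0:
                wanted[t] -= 1
                sub.append(o)
        return run(n, sub)
    raise Undefined(e)


# The example of 14.3, with F chosen so that F(lambda n: 3) = k.
k = (2, (0, 0, 1), 5)
F = Obj(2, lambda alpha: k if alpha.fn(0) == 3 else 0)
d = (2, (1, 0, 1), 3)
e1 = (8, (0, 0, 1), 2, d)
e2 = (9, (1, 0, 1), (0, 0, 1))
e = (4, (0, 0, 1), e1, e2)
result = run(e, [F])
```

In this toy, an undefined value is signalled by the `Undefined` exception when no scheme matches; genuine divergence of a subcomputation is not modelled.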


15. Equivalence of the original Kleene definition with ours 


It is immediate from the definition in Section 14 that every partial function
\[ f : \mathcal{X} \to \omega \]
which is recursive in the original sense of KLEENE [1959] is also Kleene recursive in our sense of 11.2. We will prove here the converse.


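15.1. For each point
\[ x \in \mathcal{X} = T^{(j_1)} \times \cdots \times T^{(j_n)}, \]
let
\[ \mathrm{ch}(x) = \mathrm{ch}(\mathcal{X}) = \langle j_1, \ldots, j_n\rangle \]
be the character of $x$ and for each finite sequence $\sigma = (x_1, \ldots, x_n)$ from $T^{*}$ let
\[ \mathrm{ch}(\sigma) = \langle n, \mathrm{ch}(x_1), \ldots, \mathrm{ch}(x_n)\rangle. \]
Let also
\[ \hat{\sigma} = x_1, \ldots, x_n = x_1^\frown x_2^\frown \cdots ^\frown x_n = \text{the concatenation of the points } x_1, \ldots, x_n. \]

Our plan will be to associate with each functional $\Phi(\sigma, f)$ in $\mathcal{K}_0$ a total recursive function $p_{\Phi} = p : \omega \to \omega$ such that
\[ \Phi^{\infty}(\sigma) = \{p(\mathrm{ch}(\sigma))\}(\hat{\sigma}). \]

This is of course sufficient, because if $\varphi : \mathcal{X} \to \omega$ is Kleene recursive in our sense, then for some $\Phi(\tau, x, f) \in \mathcal{K}_0$ we have $\varphi(x) = \Phi^{\infty}(\bar{n}, x)$, with $\bar{n} \in \omega^k$ so that
\[ \varphi(x) = \{e\}(\bar{n}, x) \]
with
\[ e = p(\mathrm{ch}(\omega^k \times \mathcal{X})). \]

To define $p_{\Phi}$ we will have to define simultaneously $p_{\Psi}$ for all (the finitely many) "subfunctionals" $\Psi$ of $\Phi$.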


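15.2. Notice first that from the definition of a Kleene class we know that a functional of the form $\Phi(\sigma, f)$ is in $\mathcal{K}_0$ if it is one of the functionals (i)-(vii) or is generated according to the rules (viii)-(xii) below (unexplained notation is as in 2.3 and 11.1).

(i) $\Phi(j, x, \sigma, f) = \begin{cases} 0 & \text{if } j \in \omega,\ x \in T^{(j)}, \\ 1 & \text{otherwise.} \end{cases}$

(ii) $\Phi(x, \sigma, f) = \begin{cases} x & \text{if } x \in \omega, \\ 0 & \text{if } x \notin \omega. \end{cases}$

(iii) $\Phi(x, \sigma, f) = \begin{cases} x + 1 & \text{if } x \in \omega, \\ 0 & \text{if } x \notin \omega. \end{cases}$

(iv) $\Phi(x, y, \sigma, f) = \begin{cases} 0 & \text{if } x = y \in \omega, \\ 1 & \text{if } x, y \in \omega,\ x \ne y, \\ 0 & \text{otherwise.} \end{cases}$

(v) $\Phi(\tau, \sigma, f) = f(\tau)$.

(vi) $\Phi(x, \sigma, f) = \mathrm{length}(x)$.

(vii) $\Phi(x, y, \sigma, f) = \begin{cases} y(x) & \text{if } y \in T^{(1)},\ x \in \omega, \\ 0 & \text{otherwise.} \end{cases}$

(viii) $\Psi(\sigma, f) = \Phi(X(\sigma, f), \sigma, f)$.

(ix) $\Psi(x, \sigma, f) = \begin{cases} \Phi(\sigma, f) & \text{if } x = 0, \\ X(\sigma, f) & \text{if } x \ne 0,\ x \in \omega, \\ 0 & \text{otherwise.} \end{cases}$

(x) $\Psi(y, \sigma, f) = \begin{cases} y(\lambda\alpha^j\,\Phi(\alpha^j, y, \sigma, f)) & \text{if } y \in T^{(j+2)}, \\ 0 & \text{otherwise.} \end{cases}$

(xi) $\Psi(\sigma, f) = \Phi(\pi_1(\sigma), \ldots, \pi_n(\sigma), f)$.

(xii) $\Psi(x, y, \sigma, f) = \Phi(t(x, y), \sigma, f)$, with $t$ concatenation or point projection.

Define the set $\mathrm{Sub}(\Phi)$ of subfunctionals of $\Phi(\sigma, f)$ as follows:

(a) If $\Phi$ comes from (i)-(vii) above, then $\mathrm{Sub}(\Phi) = \{\Phi\}$.

(b) If $\Psi$ comes from $\Phi$, $X$ as in (viii)-(xii) above, then $\mathrm{Sub}(\Psi) = \{\Psi\} \cup \mathrm{Sub}(\Phi) \cup \mathrm{Sub}(X)$.

Finally let $\Phi < X$ if $\Phi \in \mathrm{Sub}(X)$ & $\Phi \ne X$.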


15.3. PROPOSITION. Let $\Phi(\sigma, f)$ be an operative functional in $\mathcal{K}_0$ and let $\Phi_1, \ldots, \Phi_c$ be an enumeration of all its subfunctionals such that if $\Phi_i < \Phi_j$ then $i < j$ and $\Phi_c = \Phi$. There is a total recursive function $p(i, t)$ on $\omega$ such that for all $i \le c$,
\[ \Phi_i(\tau, \Phi^{\infty}) = \{p(i, \mathrm{ch}(\tau))\}(\hat{\tau}). \]


PROOF. Let us temporarily call a partial function $\varphi : \mathcal{X} \to \omega$ Kleene$_0$ recursive if it is recursive in the original sense of Kleene. We will need below two simple facts about Kleene$_0$ recursion which can be easily established from the definitions (or else see KLEENE [1959]):

(i) If
\[ \varphi^{\mathcal{X}}(e, x) = \{e\}(x) \quad (x \in \mathcal{X}), \]
then $\{\varphi^{\mathcal{X}}\}$ is a good system of universal partial functions for Kleene$_0$ recursion. In particular we have the second recursion theorem for Kleene$_0$ recursion.

(ii) Every partial function on $\omega$ which is recursive in the sense of ordinary recursion theory is also Kleene$_0$ recursive. The converse is also true as it follows from 12.4 and the remarks in the beginning of the present section.

We define
\[ p(i, t) = \{\hat{p}\}(i, t) \]
by using the second recursion theorem for Kleene$_0$ recursion. We consider cases according to what $\Phi_i$ is.

First put $p(i, t) = 0$, if $i > c$.

For $i \le c$, the definition of $p(i, t)$ is straightforward except when $\Phi_i$ comes from 15.2(v), i.e., evaluation. In the other cases $p(i, t)$ is either an explicitly defined recursive function or it is recursively defined in terms of $p(j, t')$ with $j < i$. For example, in case 15.2(viii) we have
\[ \Phi_i(\tau, f) = \Phi_j(\Phi_k(\tau, f), \tau, f), \]
for $j, k < i$. Then we have
\begin{align*}
\Phi_i(\tau, \Phi^{\infty}) &= \Phi_j(\Phi_k(\tau, \Phi^{\infty}), \tau, \Phi^{\infty}) \\
 &= \{p(j, \mathrm{ch}(0, \tau))\}(\{p(k, \mathrm{ch}(\tau))\}(\hat{\tau}), \hat{\tau}) \\
 &= \{\langle 4, \mathrm{arity}(\hat{\tau}), p(k, \mathrm{ch}(\tau)), p(j, \mathrm{ch}(0, \tau))\rangle\}(\hat{\tau}).
\end{align*}
So we take $p(i, t) = \langle 4, q_1(t), p(k, t), p(j, q_2(t))\rangle$, where $q_1$, $q_2$ are recursive such that $q_1(\mathrm{ch}(\tau)) = \mathrm{arity}(\hat{\tau})$, $q_2(\mathrm{ch}(\tau)) = \mathrm{ch}(0, \tau)$.

Assume now $\Phi_i$ comes from 15.2(v) and to simplify the notation take $\sigma$ empty, so that
\[ \Phi_i(\tau, f) = f(\tau). \]
Then we have
\begin{align*}
\Phi_i(\tau, \Phi^{\infty}) &= \Phi^{\infty}(\tau) = \Phi(\tau, \Phi^{\infty}) \\
 &= \Phi_c(\tau, \Phi^{\infty}) = \{p(c, \mathrm{ch}(\tau))\}(\hat{\tau}).
\end{align*}

It would seem reasonable to take $p(i, t) = p(c, t)$, but since $c > i$ we cannot expect to have $p(i, t)$ defined in this manner. If $\hat{p}$ is an index of $p$, however, we also have
\begin{align*}
\{p(c, \mathrm{ch}(\tau))\}(\hat{\tau}) &= \{\{\hat{p}\}(c, \mathrm{ch}(\tau))\}(\hat{\tau}) \\
 &= \{\langle 9, \mathrm{arity}(0, \hat{\tau}), \mathrm{arity}(\hat{\tau})\rangle\}(\{\hat{p}\}(c, \mathrm{ch}(\tau)), \hat{\tau}) \\
 &= \{q(\hat{p}, \mathrm{ch}(\tau))\}(\hat{\tau}),
\end{align*}
with a recursive function $q$ which is not hard to compute; put then
\[ p(i, t) = q(\hat{p}, t) \]
in this case. The proof will hinge on an additional property of this function $q$ which we will explain below.

By the second recursion theorem for Kleene$_0$ recursion there is an index $\hat{p}$ such that the function $p(i, t) = \{\hat{p}\}(i, t)$ satisfies all the above requirements. Clearly $p$ is total.

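To show that $p$ works we prove first by induction on $\xi$ that
\[ \Phi_i(\tau, \Phi^{\xi}) = w \implies \{p(i, \mathrm{ch}(\tau))\}(\hat{\tau}) = w. \]
For that within any fixed $\xi$ we will have to use induction on $i$. We leave the easy details to the reader.

For the other direction, we prove
\[ \{p(i, \mathrm{ch}(\tau))\}(\hat{\tau}) = w \implies \Phi_i(\tau, \Phi^{\infty}) = w \]
by induction on the ordinal
\[ |p(i, \mathrm{ch}(\tau)), \hat{\tau}| = \xi \]
(see 14.1) and for each $\xi$, by taking cases on $i$.

The only difficult case is when $\Phi_i$ comes from 15.2(v). By the definition of $p(i, \mathrm{ch}(\tau))$ in this case, we have
\begin{align*}
\{p(i, \mathrm{ch}(\tau))\}(\hat{\tau}) &= \{q(\hat{p}, \mathrm{ch}(\tau))\}(\hat{\tau}) \\
 &= \{p(c, \mathrm{ch}(\tau))\}(\hat{\tau}).
\end{align*}
Moreover, any natural choice of the function $q$ has the additional property that it increases the ordinal of the computation, i.e.
\begin{align*}
|p(i, \mathrm{ch}(\tau)), \hat{\tau}| &= |q(\hat{p}, \mathrm{ch}(\tau)), \hat{\tau}| \\
 &\ge |\langle 9, \mathrm{arity}(0, \hat{\tau}), \mathrm{arity}(\hat{\tau})\rangle, p(c, \mathrm{ch}(\tau)), \hat{\tau}| \\
 &> |p(c, \mathrm{ch}(\tau)), \hat{\tau}|;
\end{align*}
this needs checking, but it is not hard and it is the key to this part of the proof. It allows us to use the induction hypothesis, by which
\[ \Phi_c(\tau, \Phi^{\infty}) = w, \]
so that also
\[ \Phi_i(\tau, \Phi^{\infty}) = \Phi^{\infty}(\tau) = w. \]
□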


16. The Substitution Theorems of Kleene 


16.1. Let φ(y, z) be a Kleene recursive partial function and let θ: 𝒳 → 𝒴 
be a Kleene recursive partial mapping in the sense of 11.1. We consider the 
question whether 
χ(x, z) = φ(θ(x), z) 
is Kleene recursive. We can assume without loss of generality that 
𝒴 = T^{(j+1)}, so that 
χ(x, z) = φ(λα^j ψ(α^j, x), z) 


for some Kleene recursive partial function ψ. 

A counterexample of KLEENE [1959] shows that one cannot expect χ to 
be Kleene recursive when j > 0, even if θ is total. Indeed, let T(m, n) be a 
recursive relation on ω such that ∀n T(m, n) is not semirecursive (i.e. 
recursively enumerable). Let 


ψ(α^2, n, m) = 0 if T(m, n), 
ψ(α^2, n, m) is undefined if ¬T(m, n), 


and let 
φ(α^2, m) = α^2(λn ψ(α^2, n, m)). 
Then φ is Kleene recursive. Let 
θ: ω → T^{(2)} 
be any Kleene recursive total mapping. Then 
φ(θ(0), m) 


cannot be Kleene recursive since it would then be recursive (in the sense of 
ordinary recursion theory) so 


P(m) ⟺ φ(θ(0), m)↓ ⟺ ∀n T(m, n) 


would be semirecursive, which it is not. 
The following partial substitution theorem is very important. 


16.2. THEOREM (KLEENE [1959]). Let φ: 𝒴 × 𝒵 → ω be a Kleene recursive 
partial function, let θ: 𝒳 → 𝒴 be a Kleene recursive partial mapping and put 
χ(x, z) = φ(θ(x), z); 

then χ has a Kleene recursive extension. 

In more detail, there exists a Kleene recursive partial function χ*(x, z), 
such that if θ(x) and φ(θ(x), z) are both defined, then χ*(x, z) is defined 
and 

χ*(x, z) = φ(θ(x), z). 


In particular, if both φ and θ are total, then χ is Kleene recursive. 
PROOF (Outline). It is enough to assume that 𝒴 = T^{(j+1)} so that 
χ(x, z) = φ(λα^j ψ(α^j, x), z). 


The proof is a typical example of the "index transfer method" which we 
also used in establishing 15.3. In these arguments, one of the most 
important steps is to formulate the desired result in a strong form which 
admits proof by "effective transfinite induction". In this case, we will 
associate with each j a recursive function p_j(e_1, e_2, t) such that for each pair 
of codes e_1, e_2 and points x, y, z 


(*) {e_1}(y, λα^j {e_2}(α^j, x, y, z), z) = {p_j(e_1, e_2, ch(x, y, z))}(x, y, z), 


whenever the left-hand side is defined. 

The construction of p_j is by induction on j, so fix some j > 0 and assume 
that p_0, ..., p_{j−1} have been defined and (*) holds for each of them. (The 
construction of p_0 is similar to that of p_j (j > 0) and we will omit it.) 

We will define 

p(e_1, e_2, t) = p_j(e_1, e_2, t) 


using the recursion theorem, in terms of its index p̂. The definition is by 
cases on the index e_1, corresponding to 14.1(i)-(ix). All the cases except 
(viii), (ix) are straightforward since in them one defines p(e_1, e_2, t) explicitly 
(and recursively) either absolutely or in terms of p(e_1', e_2, t') with e_1' < e_1. 
Take now case 14.1(viii). If t = ch(x, y, z) for points x, y, z with y 
containing some type j + 1 object, the definition of p(e_1, e_2, t) is again 
straightforward. Otherwise we can incorporate y, z in a new sequence so 
that without any loss of generality we can assume that y is empty and 


e_1 = ⟨8, arity(α^{j+1}, z), j + 1, (e_1)_3⟩ 
so that 
{e_1}(α^{j+1}, z) = α^{j+1}(λα^{j−1} {(e_1)_3}(α^{j+1}, α^{j−1}, z)). 
Fix x and z and let 
β^{j+1} = λα^j {e_2}(α^j, x, z); 


then easily 
{e_1}(β^{j+1}, z) = {e_2}(λα^{j−1} {(e_1)_3}(β^{j+1}, α^{j−1}, z), x, z). 


Now 


β^{j+1} = λα^j {q(e_2)}(α^j, x, α^{j−1}, z) 
with a recursive q, so that 


{(e_1)_3}(β^{j+1}, α^{j−1}, z) = {(e_1)_3}(λα^j {q(e_2)}(α^j, x, α^{j−1}, z), α^{j−1}, z) 
= {p((e_1)_3, q(e_2), ch(x, α^{j−1}, z))}(x, α^{j−1}, z) 
assuming that p is defined and has the required properties on 
(e_1)_3, q(e_2), ch(x, α^{j−1}, z); we will argue this when we come to proving that 
our definition works, but notice that 
|(e_1)_3, λα^j {q(e_2)}(α^j, x, α^{j−1}, z), α^{j−1}, z| < |e_1, β^{j+1}, z|. 

To simplify notation, let 


d = p((e_1)_3, q(e_2), ch(x, α^{j−1}, z)), 
so that we have 
{e_1}(β^{j+1}, z) = {e_2}(λα^{j−1} {d}(x, α^{j−1}, z), x, z). 


At this point we have reduced the problem of computing {e_1}(β^{j+1}, z) to a 
lower type substitution, i.e. plugging λα^{j−1} {d}(x, α^{j−1}, z) into {e_2}(α^j, x, z). 
This is not quite in the right form to apply p_{j−1}, but introducing a recursive r 
(as we did with q above), we let 


{d}(x, α^{j−1}, z) = {r(d)}(α^{j−1}, x, z) 
and we have 


{e_1}(β^{j+1}, z) = {e_2}(λα^{j−1} {r(d)}(α^{j−1}, x, z), x, z) 


= {p_{j−1}(e_2, r(d), ch(x, z))}(x, z). 
Finally we set 
p(e_1, e_2, t) = p_{j−1}(e_2, r(d), ch(x, z)), 


where clearly ch(x, z) = t. 

In Case 14.1(ix) we define p(e_1, e_2, t) in terms of an index p̂ of p as in 
the proof of 15.3. We omit the details. 

The proof that p is total is exactly as in 15.3. 

To show that p satisfies (*), one checks by induction on 


|e_1, y, λα^j {e_2}(α^j, x, y, z), z| = ξ 
and within each ξ by taking cases on e_1, that 
(**) {e_1}(y, λα^j {e_2}(α^j, x, y, z), z) = w ⟹ {p(e_1, e_2, ch(x, y, z))}(x, y, z) = w. 
Again the only difficult case is 14.1(viii) and the argument in that case can 
be extracted easily from the analysis we made above. □ 


One cannot prove the converse of (**) for j > 0, even assuming that 
λα^j {e_2}(α^j, x, y, z) is total, because of the counterexample we gave in 16.1. 
For j = 0, however, one can verify that 


{p(e_1, e_2, ch(x, y, z))}(x, y, z) = w ⟹ {e_1}(y, λn {e_2}(n, x, y, z), z) = w 


by induction on |p(e_1, e_2, ch(x, y, z)), x, y, z|, so that we have: 
16.3. THEOREM (KLEENE [1959]). If type(𝒴) = 1, φ: 𝒴 × 𝒵 → ω is a 
Kleene recursive partial function and θ: 𝒳 → 𝒴 is a Kleene recursive partial 
mapping, then there is a Kleene recursive χ*(x, z) such that whenever θ(x) is 
defined, 


χ*(x, z) = φ(θ(x), z). 


In particular, if θ is total, then φ(θ(x), z) is Kleene recursive. 


Again, by a counterexample similar to the one in 16.1 one can see that 
the hypothesis that θ(x) is defined is needed above. 

Our final result in this section shows that full substitution as in 16.3 is 
permissible on any type provided an object of sufficiently high type is 
present. This fact will be quite useful in the sequel. 


16.4. THEOREM (KLEENE). Let φ: 𝒴 × 𝒵 → ω be a Kleene recursive partial 
function, let θ: 𝒳 → 𝒴 be a Kleene recursive partial mapping and put 


χ(x, z) = φ(θ(x), z). 


For each m ≥ type(𝒴), there is a Kleene recursive partial function 
χ*(α^m, x, z) such that for all objects α^m of type m and all x such that θ(x) is 
defined, we have 


χ(x, z) = χ*(α^m, x, z). 


In particular, if θ is total, then χ is recursive in every object of type 
≥ type(𝒴). 


PROOF (Outline). Assume without loss of generality that 𝒴 = T^{(j+1)}, so that 
θ(x) = λα^j ψ(α^j, x). Consider first the case when m = j + 1. 

Using the "index transfer method" again we show that there is a total 
recursive function p such that if λα^j ψ(α^j, x) is total, then for each α^m 


(*) {e}(y, λα^j ψ(α^j, x), z) = {p(e, ch(x, y, z))}(α^m, x, y, z). 


The definition of p follows the same pattern as in the proof of 16.2, but 
we do not need induction on j since we can now use 16.2. 

As in the proof of that result, consider only the interesting subcase of 
case 14.1(viii), when e is such that 


{e}(λα^j ψ(α^j, x), z) = ψ(λα^{j−1} {(e)_3}(λα^j ψ(α^j, x), α^{j−1}, z), x); 


if p is defined and has the desired properties on (e)_3, then this expression is 
equal to 


cH. C.6, §16] THE SUBSTITUTION THEOREMS OF KLEENE 727 
ψ(λα^{j−1} {p((e)_3, ch(α^{j−1}, z, x))}(α^m, α^{j−1}, z, x), x) = 


= ψ(λα^{j−1} {d}(α^m, α^{j−1}, z, x), x), 
where 
d = p((e)_3, ch(α^{j−1}, z, x)). 


Using 16.2, we can find a total recursive function h such that 
(**) ψ(λα^{j−1} {d}(α^m, α^{j−1}, z, x), x) = {h(d, ch(x, z))}(α^m, x, z) 


whenever 
f = λα^{j−1} {d}(α^m, α^{j−1}, z, x) 


is total and the left-hand side of (**) is defined. It remains only to modify 
the right-hand side of (**) so that it is defined only when f is a total 
function and it is here that the presence of the argument α^m is used. 
Find a recursive total function h' such that 


{h'(d, ch(x, z))}(α^m, x, z) = {h(d, ch(x, z))}(α^m, x, z) if α^m(f)↓, 
{h'(d, ch(x, z))}(α^m, x, z) is undefined otherwise, 


and put in this case 

p(e, t) = h'(p((e)_3, q(t)), t) 
where q is recursive and such that 

q(ch(x, z)) = ch(α^{j−1}, z, x). 


Note here that h' can be chosen so that |h'(d, ch(x, z)), α^m, x, z| > 
|d, α^m, α^{j−1}, z, x| for all α^{j−1}. 
With this definition it is not hard to verify that 
{e}(y, λα^j ψ(α^j, x), z) = w ⟹ {p(e, ch(x, y, z))}(α^m, x, y, z) = w 


by induction on |e, y, λα^j ψ(α^j, x), z| and the converse implication by 
induction on |p(e, ch(x, y, z)), α^m, x, y, z|, both under the hypothesis that 
λα^j ψ(α^j, x) is total. 

To prove now the general case when 


m ≥ j + 1 
it is enough to establish the following: 


LEMMA. For each Kleene recursive partial function φ(α^j, x) the partial 
function 


ψ(α^{j+1}, x) = φ(d_j(α^{j+1}), x) 
is also Kleene recursive. (Here d_j is the type decreasing mapping of 10.2.) 


ProoF. We use induction on j. For each fixed j we apply again the ‘index 
transfer method’’. The troublesome case 14.1(viii) is handled easily, 
noticing that if 
g(a',x)=a' (Aa! x(a, a’, x)), 
then 
p(d,(a’*’), x) = dj(a’*')(Aa! x(a’, a’, x)) 

= a! "(u' (Aa! x(a’, a”, x))) 

= a! *"(AB''x(a', d-(B""'), x)) 
so that one can easily use the induction hypothesis on j. 


This completes the proof of the lemma and our outline of the proof of 
16.4. 0 


16.5. As a simple corollary of those substitution theorems we can now 
obtain the following facts about relative Kleene recursion. For any two 
objects F, G, let F ≤ G mean that F is Kleene recursive in G, i.e., for some 
Kleene recursive partial function φ and all α^j, 


F(α^j) = φ(G, α^j). 
By 16.2, ≤ is a transitive relation. This allows us to define the 
equivalence relation 
F ≡ G ⟺ F ≤ G & G ≤ F. 
If F ≡ G we say that F, G have the same Kleene degree. If F ≤ G, then 
every total function which is Kleene recursive in F is also Kleene recursive 
in G. This also holds for partial functions (by 16.4) provided only that 


type(F) ≤ type(G). 


17. Sections and envelopes 


17.1. A pointset of type k > 0 is a subset R ⊆ 𝒳 of a space of type k − 1. 
For example, the pointsets of type 1 are the subsets of ω^n, for some n. If 
R ⊆ 𝒳 is a pointset, we write interchangeably 


x ∈ R ⟺ R(x). 


A k-pointclass is a collection of pointsets of type ≤ k. Given a finite list of 
functionals Φ, the k-envelope of Φ, in symbols 
ₖen(Φ), 


is the k-pointclass consisting of those pointsets of type ≤ k which are 
Kleene semirecursive in Φ, i.e., domains of partial functions Kleene 
recursive in Φ. The k-section of Φ, in symbols 


ₖsc(Φ), 


is the k-pointclass consisting of the pointsets of type ≤ k which are Kleene 
recursive in Φ, i.e., their characteristic functions are Kleene recursive in Φ. 
For example, it follows from 12.4 that, for k = 1, 2, 


ₖen(²E) = all Π^1_1 pointsets of type ≤ k, 
ₖsc(²E) = all Δ^1_1 pointsets of type ≤ k. 


We will devote the rest of this chapter to the study of properties of 
envelopes and sections of higher type objects. In general, no interesting 
results can be stated about the structure of ,en(F), unless F is normal on 
Tt... By Theorem 13.3(i), if F =(F,-::F,) and m = max{type(F): i = 
1,...,n}=2, then F’E---"E is normal on T%*, so we will restrict 
ourselves to the study of ₖen(F, ²E, ..., ᵐE) for k ≤ m + 1. Note here that 
ʲE ≤ ᵐE when j ≤ m, since ʲE = d_{j,m}(ᵐE) (in the notation of 10.2), so that 
ₖen(F, ²E, ..., ᵐE) = ₖen(F, ᵐE). Moreover we can code all objects in F by a 
single object G of type m so that ₖen(F, ᵐE) = ₖen(G, ᵐE). This is done by 
first lifting to type m (using the u^{j,m}) all the objects in F and then using a 
simple pairing function for type m objects like ⟨F_1, F_2⟩(α^{m−1}) = 
⟨F_1(α^{m−1}), F_2(α^{m−1})⟩. So from now on we will study 


ₖen(F, ᵐE) 


for type(F) = m ≥ 2 and k ≤ m + 1. We summarize below the basic 
properties of these envelopes which follow more or less immediately from 
what we have already proved. 


17.2. Let Γ be a k-pointclass. We say that Γ is closed under ∃^j if for each 
R ⊆ 𝒳 × T^{(j)} in Γ, the pointset 


P(x) ⟺ ∃α^j R(x, α^j) 


is also in Γ. Closure under ∀^j and ¬, ∨, ∧ is defined in the same way. We 
say that Γ is closed under substitution by total Kleene recursive mappings if 
it contains all the Kleene recursive pointsets of type ≤ k and whenever 
θ: 𝒳 → 𝒴 is Kleene recursive and total with type(𝒳), type(𝒴) < k and 
R(y, z) is in Γ, then so is 
P(x, z) ⟺ R(θ(x), z). 


We say that Γ has the enumeration property or is ω-parametrized if for 
each 𝒳 of type < k there is some W ⊆ ω × 𝒳 in Γ such that for each 
R ⊆ 𝒳, 


R ∈ Γ ⟺ ∃e (R = W_e), 


where W_e = {x: W(e, x)}. Finally, we say that Γ is normed or has the 
prewellordering property if every R ∈ Γ admits a Γ-norm (see 7.1 for the 
definition). 

We now have the following result whose most important assertions, 
namely prewellordering and closure under ∃^0, are due to GANDY [1967] for 
m = 2, and PLATEK [1966] and MOSCHOVAKIS [1967] for m ≥ 3 (the proof in 
MOSCHOVAKIS [1967] is given only for m = 3 and GRILLIOT [1967] extended 
it to all m). 


17.3. THEOREM. Let F = ᵐF be an object of type m ≥ 2. Let Γ = ₖen(F, ᵐE), 
for 1 ≤ k ≤ m + 1. Then 

(i) Γ is closed under total Kleene recursive substitutions, ∧, ∨, ∃^0, ∀^j 
(j ≤ m − 2), is normed and has the enumeration property. 

(ii) A function f: 𝒳 → ω, where type(𝒳) < k, is Kleene recursive in F, ᵐE 
if and only if graph(f) ∈ Γ. In particular a pointset R ⊆ 𝒳 of type ≤ k is 
Kleene recursive in F, ᵐE if and only if both R and 𝒳 − R are in Γ. 


PROOF. Closure under total Kleene recursive substitution follows from 
16.4. Closure under ∧, ∨, ∃^0 follows from 13.3, 13.4, 9.4 and 7.3. For 
closure under ∀^j (j ≤ m − 2) note that if φ(x, α^j) is a partial function, then 


∀α^j (φ(x, α^j)↓) ⟺ ʲ⁺²E(λα^j φ(x, α^j))↓. 


Normality and enumeration follow from 13.3 and 13.5. Finally (ii) follows 
as in 9.4. □ 


18. Inductive analysis of semirecursive sets 


We give here an inductive analysis of the pointsets in ₖen(ᵐF, ᵐE) which 
turns out to be the key to many of the further structural properties of 
envelopes. Some of its immediate applications to the problem of closure of 
envelopes under higher existential quantification will be given in the next 
section. 
18.1. An operator 
Ω: power(𝒳) → power(𝒳) 


on 𝒳 is monotone if 
A ⊆ B ⊆ 𝒳 ⟹ Ω(A) ⊆ Ω(B). 


Given such an operator we define inductively 
Ω^ξ = Ω(Ω^{<ξ}), where Ω^{<ξ} = ⋃_{η<ξ} Ω^η, 


and we let Ω^∞ = ⋃_ξ Ω^ξ = least fixed point of Ω. 
If R ⊆ 𝒳, S ⊆ 𝒴 are two pointsets we say that R is reducible to S if there 
is a total Kleene recursive θ: 𝒳 → 𝒴 such that R(x) ⟺ S(θ(x)). 
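
On a finite set the iteration of a monotone operator stops after finitely many stages, which makes the definition easy to experiment with. The following Python sketch is only an illustration under stated assumptions: the universe `UNIVERSE`, the relation `R` and all helper names are hypothetical, and the spaces considered in this chapter are of course infinite. The operator is taken of the form x ∈ Ω(S) ⟺ ∀y (R(x, y) ⟹ y ∈ S).

```python
from typing import Callable, FrozenSet, Set, Tuple

# Hypothetical finite data: a universe and a binary relation R on it.
UNIVERSE: FrozenSet[int] = frozenset(range(6))
R: Set[Tuple[int, int]] = {(1, 0), (2, 1), (3, 3), (4, 3), (5, 2)}


def omega(s: FrozenSet[int]) -> FrozenSet[int]:
    """x is in omega(s) iff every y with R(x, y) already lies in s."""
    return frozenset(
        x for x in UNIVERSE if all(y in s for (x2, y) in R if x2 == x)
    )


def least_fixed_point(op: Callable[[FrozenSet[int]], FrozenSet[int]]):
    """Iterate op stage by stage until no new point appears.

    `below` plays the role of the union of all earlier stages.
    Returns the fixed point together with the number of stages used.
    """
    below: FrozenSet[int] = frozenset()
    stages = 0
    while True:
        stage = op(below)
        if stage <= below:      # nothing new: below is a fixed point
            return below, stages
        below = below | stage
        stages += 1


FIXED_POINT, STAGES = least_fixed_point(omega)
```

In this example the points 3 and 4 never enter the fixed point: 3 is related to itself and 4 is related to 3, so neither ever has all of its R-successors generated.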


18.2. THEOREM (MOSCHOVAKIS [1967]). Let F = ᵐF be an object of type 
m ≥ 2 and let m − 1 ≤ k ≤ m + 1. Every P ∈ ₖen(F, ᵐE) is reducible to the 
fixed point Ω^∞ of some monotone operator Ω on a space of type < k, which 
has the form 


x ∈ Ω(S) ⟺ ∀y (R(x, y) ⟹ y ∈ S) 
with R ∈ ₖen(F, ᵐE). 
PROOF (outline). To simplify the notation let G = ⟨F, ᵐE⟩. Then ₖen(G) = 
ₖen(F, ᵐE), so we work with G below. We treat only the case k = m − 1 
since the other cases are similar and a little easier. It will also be convenient 
to work with the single standard space 𝒳 = T^{(m−2)} of type m − 2 and for 
that we code finite sequences of objects of type ≤ m − 2 by single objects 
of type m − 2 as follows: If α^j is an object of type ≤ m − 2 let 


ᾱ^j = u^{j,m−2}(α^j). 
If α_0, ..., α_{n−1} is a sequence of objects of type ≤ m − 2 let 
⟨α_0, ..., α_{n−1}⟩ = λα^{m−3} ⟨ᾱ_0(α^{m−3}), ..., ᾱ_{n−1}(α^{m−3})⟩. 
The decoding functions are now given as follows: for any object α^{m−2} put 
(α^{m−2})_i = λα^{m−3} ((α^{m−2}(α^{m−3}))_i) 
and for any j ≤ m − 2 let 
(α^{m−2})^j_i = d_{j,m−2}((α^{m−2})_i). 
Then, if j = type(α_i), we have 
(⟨α_0, ..., α_{n−1}⟩)^j_i = α_i. 


Clearly all these coding and decoding functions are total Kleene recursive. 
(The above definitions were given for m > 2. We leave to the reader the 
trivial modifications needed if m = 2.) Following KLEENE [1959] and 
GRILLIOT [1967] let us abbreviate 


{e}[α^{m−2}, α^m] = {e}((α^{m−2})^0_0, ..., (α^{m−2})^0_{k_0−1}, 
(α^{m−2})^1_{k_0}, ..., (α^{m−2})^1_{k_0+k_1−1}, ..., 
(α^{m−2})^{m−2}_{k_0+...+k_{m−3}}, ..., (α^{m−2})^{m−2}_{k_0+...+k_{m−2}−1}, α^m) 


if e is an index of a partial function having k_i arguments of type i ≤ m − 2 
and one argument of type m. It is easy to see that for some Kleene 
recursive φ, 
{e}[α^{m−2}, α^m] = φ(e, α^{m−2}, α^m). 
Let 
W = {(e, α^{m−2}): {e}[α^{m−2}, G] is defined}. 


It is clearly enough to show that W = Ω^∞ for an Ω as in the statement of 
the theorem. 

We view each (e, α^{m−2}) as coding the computation {e}[α^{m−2}, G] according 
to the Kleene definition in Section 14. As in 14.3, to compute 
{e}[α^{m−2}, G] we have to compute its subcomputations which all have the 
form {e'}[β^{m−2}, G] and therefore are coded by various (e', β^{m−2})'s. For 
example, if e is an index and (e)_0 = 4 then the subcomputations of (e, α^{m−2}) 
are ((e)_2, α^{m−2}) and (letting l = k_0 + k_1 + ... + k_{m−2}) 


((e)_3, ⟨{(e)_2}[α^{m−2}, G], 
(α^{m−2})^0_0, ..., (α^{m−2})^{m−2}_{l−1}⟩) 


provided that {(e)_2}[α^{m−2}, G] is defined. 
Put 


R((e,a” *), (d, B”~’)) & (e is not an index of the appropriate form and 
e=danda”™’= 8B” *)v ((d,B” )isa 
subcomputation of (e, a”~’)). 

Then R is Kleefie semirecursive in G. One can see this by noticing that R is 
defined by cases on what the index e is. In all cases except when (e))=4, R 


is actually Kleene recursive in G. In case (e))=4 and e accepts the right 
type of arguments then 
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R((e,a""”), (4, B"”)) @ 
& (d=(e) & B™*=a"”)v 
(d= (e)s & An ({(e).}[e"*, G] =n 
& B72 = (ny, (079, 0 (Mant oe CO, 6 (Oem at) 
so it is Kleene semirecursive in G. Now letting 
(e, a" *)E A(S) & Vd, B™ [R((e,a"*), (4, 8") > (4 BYES] 


we have that W = 2” and we are done. DJ 


19. Closure under higher existential quantification 


As a simple corollary of the representation theorem in 18.2 we now 
have: 


19.1. THEOREM (MOSCHOVAKIS [1967]). Let F = ᵐF be an object of type 
m ≥ 3. If R ∈ ₘ₋₁en(F, ᵐE), then there is some P in ₘ₋₁en(F, ᵐE) such that 
¬R(x) ⟺ ∃α^{m−2} P(x, α^{m−2}). 
PROOF. Let θ: 𝒳 → 𝒴 be total Kleene recursive and Ω an operator on 𝒴 
as in 18.2 such that R(x) ⟺ Ω^∞(θ(x)). Say 
y ∈ Ω(S) ⟺ ∀z (Q(y, z) ⟹ z ∈ S), 
with Q ∈ ₘ₋₁en(F, ᵐE). Notice then that 


¬Ω^∞(y) ⟺ ∃y_0, y_1, y_2, ... (y_0 = y ∧ ∀i Q(y_i, y_{i+1})), 
so that 
¬R(x) ⟺ ∃y_0, y_1, ... (y_0 = θ(x) ∧ ∀i Q(y_i, y_{i+1})). 


To complete the proof we only have to code y_0, y_1, ... by a single element 
of 𝒴. For that, assume without loss of generality that 𝒴 = T^{(m−2)} and let 


φ(α^{m−2}, i) = [α^{m−2}]_i = λα^{m−3} α^{m−2}(⟨i, α^{m−3}⟩). 
Then φ is total Kleene recursive and 
¬R(x) ⟺ ∃α^{m−2} ([α^{m−2}]_0 = θ(x) ∧ ∀i Q([α^{m−2}]_i, [α^{m−2}]_{i+1})), 
so we are done. □ 
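
The coding step at the end of this proof can be tried out in the simplest instance m = 3, where the elements of T^{(1)} are functions on the natural numbers and [α]_i = λn α(⟨i, n⟩). The Python sketch below is an illustration only: the Cantor pairing used for ⟨i, n⟩ and the names `pair`, `unpair`, `component` and `code_sequence` are assumptions made here, not notation from the text.

```python
from math import isqrt
from typing import Callable, Tuple

Func = Callable[[int], int]


def pair(i: int, n: int) -> int:
    """Cantor pairing, standing in for the pairing <i, n> (an assumed choice)."""
    return (i + n) * (i + n + 1) // 2 + n


def unpair(z: int) -> Tuple[int, int]:
    """Inverse of `pair`."""
    w = (isqrt(8 * z + 1) - 1) // 2
    n = z - w * (w + 1) // 2
    return w - n, n


def component(alpha: Func, i: int) -> Func:
    """[alpha]_i = lambda n. alpha(<i, n>)."""
    return lambda n: alpha(pair(i, n))


def code_sequence(ys: Callable[[int], Func]) -> Func:
    """Code y_0, y_1, ... by a single alpha such that [alpha]_i = y_i."""
    def alpha(z: int) -> int:
        i, n = unpair(z)
        return ys(i)(n)
    return alpha


# Hypothetical sequence y_i(n) = 10 * i + n, coded by one function ALPHA.
ALPHA = code_sequence(lambda i: (lambda n: 10 * i + n))
```

Decoding with `component(ALPHA, i)` recovers the i-th function of the sequence, which is exactly the property of [α]_i used in the last displayed equivalence of the proof.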


The previous result holds as well for R ∈ ₘ₊₁en(F, ᵐE) by a modification 
of the preceding argument (see MOSCHOVAKIS [1967]) but we will have no 
use for this stronger version. Notice also that even if m = 2, the above 
proof shows that if for example R ⊆ ω is in ₁en(F, ²E), then for some 
Q(m, n) such that Q ∈ ₁en(F, ²E), 


¬R(x) ⟺ ∃α (α(0) = x ∧ ∀i Q(α(i), α(i + 1))). 


In view of 19.1 we now have: 


19.2. COROLLARY (MOSCHOVAKIS [1967]). Let F = ᵐF be an object of type 
m ≥ 3 and let k ≥ m − 1; then ₖen(F, ᵐE) is not closed under ∃^{m−2}. 


As an application of 19.1 we see that for any m ≥ 3 every pointset of type 
≤ m − 1 semirecursive in ᵐE is recursive in ᵐ⁺¹E, since ₘ₋₁en(ᵐ⁺¹E) is 
obviously closed under ∃^{m−2}. In particular, 


mien("E)S,-en("E*) for m =3. 


The negative upper bound 19.2 for the problem of closure of envelopes 
under existential quantification turns out to be optimal in view of the 
following important result. This was announced in GRILLIOT [1969b] but 
was first proved correctly in HARRINGTON and MACQUEEN [1976]. 


19.3. THEOREM (GRILLIOT [1969b], HARRINGTON and MACQUEEN [1976]). 
Let F =""F be an object of type m =2. Then,,_,en(F,"E) is closed under 3 
for each l<m—2. 


The proof of 19.3 is rather lengthy and will not be given here. One of its 
basic ingredients is again the use of the representation theorem 18.2. 


20. A guide to the literature 


We have presented an exposition of the elementary parts of the theory of 
recursion in higher types. For the reader who wishes to pursue the study of 
the subject beyond this, we suggest here some further reading. We will try 
to cover mainly those aspects of higher type recursion that have been 
introduced in this chapter. This necessarily leaves out some quite interest- 
ing topics of current research, for example the theory of continuous objects 
and functionals. 

For the foundations of recursion in higher types the reader should 
consult KLEENE [1959, 1963] and PLATEK [1966] as well as GANDY [1967], 
where the basic concepts of stage comparison and selection are introduced 
in the subject. 

Much of the early work in the theory was concerned with the construc- 
tion and study of various hierarchies for the sections of higher type objects 
in the spirit of the hyperarithmetical hierarchy. Relevant references here 
are KLEENE [1963], TUGUÉ [1960], GANDY [1967], SHOENFIELD [1968], 
MOSCHOVAKIS [1967], GRILLIOT [1969a], HINMAN [1969], and more recently 
WAINER [1974, 1975]. 

Later on, Sacks initiated a deeper study of sections, concentrating 
particularly on the effect of the type of an object F on the structure of its 
sections. He also studied the analogues for Kleene degrees of many 
well-known problems from the theory of degrees of unsolvability, like 
Post’s problem. References here include Sacks [1971, 1974, 1975] (where 
the Plus-1 Theorem is proved), MACQUEEN [1972], HARRINGTON [1973], and 
SACKS [1976]. See also GRILLIOT [1971]. 

Still later, the study of the structure of envelopes became another 
important area of research in higher type recursion. To find out about this 
recent work the reader can consult MOSCHOVAKIS [1974b], HARRINGTON 
[1973] (where the Plus-2 Theorem is proved), KECHRIS [1973], HARRINGTON 
and KECHRIS [1975], and NORMANN [1974]. 

There are few results in the literature about recursion in specific higher 
type objects, other than those we have already mentioned in this chapter, 
e.g. Fa, F% for various Q’s. An exception is the superjump which has been 
studied extensively; put 


³S(e, α^2) = 0 if {e}(α^2)↓, 
³S(e, α^2) = 1 if {e}(α^2)↑, 


(GANDY [1967]). Although ³E ≰ S, recursion in S has many interesting 
features; see GANDY [1967], PLATEK [1971], ACZEL and HINMAN [1974], 
HARRINGTON [1973, 1974a] where the generalization of the superjump to 
higher than type 3 is also studied. Lately, in still unpublished work, 
HARRINGTON [1974b, 1975] has introduced some particularly interesting 
new examples of normal functionals whose study looks very promising. 

Perhaps the most important open problem of the theory is conceptual, 
namely to relate higher type recursion with other work in the foundations 
and to find its proper place within definability theory. There are leads in 
this direction in the papers we have cited above, but the general question is 
still wide open. 
Introduction 


Inductive definitions of sets are often informally presented by giving 
some rules for generating elements of the set and then adding that an 
object is to be in the set only if it has been generated according to the rules. 
An equivalent formulation is to characterise the set as the smallest set 
closed under the rules. 

Of course the basic example of an inductive definition is the one 
generating the natural numbers. But it has long proved a useful device 
when presenting the syntax of a formal language. Further examples of its 
informal use may be found in logic and other branches of mathematics. 
Post [1943] realised that the finitary inductive definitions used in present- 
ing the syntax of any standard formal system could all be put in a canonical 
form, and the general class of such inductions could be fruitfully studied in 
abstraction from any specific formal system. Since then inductive defini- 
tions have played an important role in the development of ordinary 
recursion theory and its generalisations. Recent work has tended to present 
the theory of inductive definitions in abstraction from the original motivat- 
ing intuitions. Our aim here is to give an introduction to the subject which 
will connect the informal examples with the recent formulation in terms of 
iterations of monotone operators. We have in mind a reader familiar with 
the concept of a formal system and with the elements of ordinary recursion 
theory. 

Most of our exposition will be concerned with monotone induction and 
its role in extensions of recursion theory. But in 3.5 we review some of the 
work on non-monotone induction, and outline there the separate motiva- 
tion that has led to its development. In Section 4 we briefly consider 
inductive definitions in a more general context. For a detailed development 
of the theory of positive induction see MOSCHOVAKIS [1974a]. Several 
papers on inductive definability may be found in FENSTAD and HINMAN 
[1974]. These include a survey, GANDY [1974], and the papers AANDERAA 
[1974], CENZER [1974], and RICHTER and ACZEL [1974]. MOSCHOVAKIS 
[1976] gives an abstract algebraic approach to the general theory that 
applies to both monotone and non-monotone induction and also to 
recursion in higher types. 
1. What are inductive definitions? 


1.1. Inductive definitions as generalized formal systems 

Inductive definitions are used repeatedly when logicians describe the 
syntax of their languages. For example the terms of a first order language 
are defined to be the smallest set of expressions containing the variables 
and constants and closed under the term formation rule: 

If t_1, ..., t_n are terms and f is an n-ary function symbol of the language 
then the expression f(t_1, ..., t_n) is a term. 

Similarly the formulae of a first order language are defined to be the 
smallest set of expressions containing the atomic formulae that is closed 
under the various formula formation rules for the logical symbols ¬, ∨, 
∧, →, ∃x, ∀x. 

For our purpose, the most useful example to consider will be the 
definition of the class of theorems of a formal system. Consider the 
Hilbert-style system H for first order logic used in Chapter A.1. The set 
Th(H) of theorems of H is there defined in terms of a notion of “‘proof” for 
H. But Th(H) may also be characterized as the smallest set of formulae, 
containing the axioms, that is closed under the rules of inference. Each 
instance of a rule of inference has the form: 


(*) From the premisses φ for φ ∈ X, infer the conclusion ψ. 


In the case of modus ponens X consists of two premisses φ and (φ → ψ). 
Instances of the generalization rule only have one premiss. It is convenient 
to consider an axiom scheme as a special form of rule of inference where in 
each instance the set of premisses is empty. Using this convention the 
formal system H determines a set Φ_H of pairs (X, ψ) such that (*) is an 
instance of a rule of inference of H. Then Th(H) is simply the smallest set 
closed under (*) for (X, ψ) ∈ Φ_H. 
Generalizing we obtain the following definitions. 


1.1.1. DEFINITION. (i) A rule is a pair (X, x) where X is a set, called the set 
of premisses and x is the conclusion. The rule will usually be written X → x. 

(ii) If Φ is a set of rules (also called a rule set below), then a set A is 
Φ-closed if each rule in Φ whose premisses are in A also has its conclusion 
in A. We shall write Φ: X → x to denote that the rule X → x is in Φ. So A 
is Φ-closed if Φ: X → x & X ⊆ A implies x ∈ A. 

(iii) If Φ is a rule set, then I(Φ), the set inductively defined by Φ, is given 
by I(Φ) = ⋂{A | A is Φ-closed}. 


742 ACZEL/ AN INTRODUCTION TO INDUCTIVE DEFINITIONS [cH. C.7,, §1 


Note. Φ-closed sets exist; e.g. the set of conclusions of rules in Φ. Also, the 
intersection of any collection of Φ-closed sets is Φ-closed. In particular 
I(Φ) is Φ-closed and hence I(Φ) is the smallest Φ-closed set. 
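
For a finite rule set these notions can be computed directly. The Python sketch below is an illustration under stated assumptions: the rule set `RULES` is hypothetical, rules are represented as pairs (frozenset of premisses, conclusion), and I(Φ) is computed by repeatedly applying rules until nothing new is generated, which is a different procedure from the defining intersection. The function `intersection_of_closed_sets` then checks the result against the definition by brute force over the subsets of the set of conclusions (which suffices, since the set of conclusions is itself Φ-closed).

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List, Tuple

Rule = Tuple[FrozenSet[int], int]   # (set of premisses, conclusion)

# Hypothetical finite rule set: 0, successor steps up to 5, and two extras.
RULES: List[Rule] = [(frozenset(), 0)]
RULES += [(frozenset({n}), n + 1) for n in range(5)]
RULES += [(frozenset({0, 2}), 100), (frozenset({7}), 8)]


def is_closed(rules: Iterable[Rule], a: FrozenSet[int]) -> bool:
    """A is closed if every rule with premisses inside A has its conclusion in A."""
    return all(x in a for premisses, x in rules if premisses <= a)


def inductively_defined(rules: Iterable[Rule]) -> FrozenSet[int]:
    """Generate I(rules) by applying rules until no new element appears."""
    rules = list(rules)
    result = set()
    changed = True
    while changed:
        changed = False
        for premisses, x in rules:
            if premisses <= result and x not in result:
                result.add(x)
                changed = True
    return frozenset(result)


def intersection_of_closed_sets(rules: List[Rule]) -> FrozenSet[int]:
    """Intersect all closed subsets of the set of conclusions (brute force)."""
    conclusions = sorted({x for _, x in rules})
    meet = frozenset(conclusions)
    for size in range(len(conclusions) + 1):
        for subset in combinations(conclusions, size):
            candidate = frozenset(subset)
            if is_closed(rules, candidate):
                meet &= candidate
    return meet


I_PHI = inductively_defined(RULES)
```

The element 8 is a conclusion of a rule but is not generated, because its premiss 7 is never generated; so the set of conclusions is closed without being the smallest closed set.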


Returning to our example we see that Th(H) = I(Φ_H). Similarly, rule sets 
can easily be found for inductively defining the sets of terms and formulae 
of a first order language. 


Note. What is usually called a rule of inference or formation rule corre- 
sponds to what we have called a rule set. It is the instances of rules of 
inference or formation rules that correspond to what we have called a rule. 


Perhaps the most familiar example of an inductive definition in 
mathematics is the one that characterizes the set of natural numbers 
ω = {0, 1, 2, ...} as the smallest set containing 0 and closed under the 
successor function, i.e., ω = I(Φ_ω) where Φ_ω consists of the rule ∅ → 0 and 
the rules {n} → n + 1 for n ∈ ω. This characterization justifies the principle 
of mathematical induction: If P is a property that holds of 0 and holds of 
n + 1 whenever it holds of n, then P holds for all natural numbers, i.e. 
{n ∈ ω | P(n)} is Φ_ω-closed implies ω ⊆ {n ∈ ω | P(n)}. 

Generalizing, we see that to each rule set Φ there is a principle of 
Φ-induction: If P is a property, such that whenever Φ: X → x and 
∀y ∈ X P(y) then P(x), then P(a) holds for every a ∈ I(Φ). 

The above principle is the natural tool to use in proving properties 
of I(®). 


1.1.2. EXAMPLE. To show that every theorem of H is universally valid in 
every structure, it suffices to show that for each structure 𝔄 
{φ(v_1, ..., v_n) | 𝔄 ⊨ ∀v_1 ... ∀v_n φ(v_1, ..., v_n)} is Φ_H-closed. 


So far all our inductive definitions have been finitary, in the sense that 
each rule has only finitely many premisses. For such ® we can generalize 
the standard notion of ‘‘proof’’ for formal systems such as H. 


1.1.3. DEFINITION. a_0, ..., a_n is a (finite length) Φ-proof of b if 

(i) a_n = b, 

(ii) for all m ≤ n there is an X ⊆ {a_i | i < m} such that Φ: X → a_m. 
1.1.4. PROPOSITION. For finitary Φ, 


I(Φ) = {b | b has a Φ-proof}. 
To show that every b ∈ I(Φ) has a Φ-proof it suffices to show that the 
right hand side is Φ-closed and use Φ-induction. For the converse 
direction if a_0, ..., a_n is a Φ-proof it suffices to show by induction on m ≤ n 
that a_m ∈ I(Φ). 
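
Definition 1.1.3 translates directly into a checker for finite rule sets. The following Python sketch is an illustration only: the rule set over strings is hypothetical, and `find_proof` is one simple search strategy (apply any applicable rule until the target appears), not a procedure taken from the text.

```python
from typing import FrozenSet, List, Optional, Sequence, Tuple

Rule = Tuple[FrozenSet[str], str]   # (set of premisses, conclusion)

# Hypothetical finitary rule set: two axioms, a modus ponens instance, etc.
RULES: List[Rule] = [
    (frozenset(), "p"),
    (frozenset(), "p->q"),
    (frozenset({"p", "p->q"}), "q"),
    (frozenset({"q"}), "r"),
    (frozenset({"s"}), "t"),
]


def is_proof(rules: List[Rule], seq: Sequence[str], b: str) -> bool:
    """Check conditions (i) and (ii) of the definition of a proof of b."""
    if not seq or seq[-1] != b:
        return False
    for m, a_m in enumerate(seq):
        earlier = set(seq[:m])
        if not any(x == a_m and premisses <= earlier for premisses, x in rules):
            return False
    return True


def find_proof(rules: List[Rule], b: str) -> Optional[List[str]]:
    """Search for a proof of b; return None if b is never generated."""
    derived: List[str] = []
    changed = True
    while changed and b not in derived:
        changed = False
        for premisses, x in rules:
            if x not in derived and premisses <= set(derived):
                derived.append(x)
                changed = True
                break
    return derived if b in derived else None
```

For instance `find_proof(RULES, "r")` produces a sequence that `is_proof` accepts, while `find_proof(RULES, "t")` returns `None` because the premiss "s" is never generated.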


1.2. The well-founded part of a relation 

Let < be a binary relation on a set A. The well-founded part of < is the 
set W(<) of a ∈ A such that there is no infinite descending sequence 
a > a_0 > a_1 > .... The relation < is a well-founded relation if A = 
W(<). W(<) can be inductively defined as follows. Let Φ_< be the set of 
rules (< a) → a for a ∈ A, where (< a) = {x ∈ A | x < a}. 


1.2.1. PROPOSITION. W(<) = I(Φ_<). 


PROOF. To show that I(Φ_<) ⊆ W(<) it suffices to show that W(<) is 
Φ_<-closed and use Φ_<-induction. So assume (< a) ⊆ W(<). Now if 
a > a_0 > a_1 > ..., then a_0 ∈ (< a) ⊆ W(<). But as a_0 > a_1 > ..., 
a_0 ∉ W(<) which gives a contradiction. Hence a ∈ W(<). 

Conversely, to show W(<) ⊆ I(Φ_<) let a ∉ I(Φ_<). We shall find a > 
a_0 > a_1 > ... showing that a ∉ W(<). As a ∉ I(Φ_<), then (< a) ⊄ I(Φ_<). 
Hence there is an a_0 < a such that a_0 ∉ I(Φ_<). Repeating we can find 
a_1 < a_0 such that a_1 ∉ I(Φ_<). Repeating indefinitely we obtain a > a_0 > 
a_1 > .... □ 


Note that a form of the axiom of choice is needed (the axiom of 
dependent choices). 

The principle of Φ_<-induction becomes, when < is well-founded, the 
principle of transfinite induction along a well-founded relation. Associated 
with transfinite induction is the method of definition by transfinite 
recursion. This enables one to define a unique function f on W(<) so that for 
a ∈ W(<), f(a) is defined in terms of f(x) for x < a. The uniqueness and 
existence of such f can be justified by suitable instances of transfinite 
induction. As an example we may assign to each a ∈ W(<) an ordinal 
|a|_< so that 


|a|_< = sup{|x|_< + 1 | x < a}. 
The ordinal |<| of the well-founded part of < is defined as 


|<| = sup{|a|_< + 1 | a ∈ W(<)}. 
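
On a finite set both W(<) and the ordinals |a|_< are easy to compute, and all ordinals involved are natural numbers. The Python sketch below is an illustration with hypothetical data: `LESS` is an assumed relation given as a set of pairs (x, a) meaning x < a, and W(<) is generated through the rules (< a) → a as in 1.2.1.

```python
from typing import Dict, Set, Tuple

# Hypothetical finite relation on {0, ..., 5}: (x, a) in LESS means x < a.
ELEMENTS: Set[int] = set(range(6))
LESS: Set[Tuple[int, int]] = {(0, 1), (1, 2), (0, 2), (3, 3), (3, 4)}


def predecessors(a: int, less: Set[Tuple[int, int]]) -> Set[int]:
    """The set (< a) of all x with x < a."""
    return {x for (x, y) in less if y == a}


def well_founded_part(elements: Set[int], less: Set[Tuple[int, int]]) -> Set[int]:
    """Generate W(<) with the rules (< a) -> a."""
    wf: Set[int] = set()
    changed = True
    while changed:
        changed = False
        for a in elements:
            if a not in wf and predecessors(a, less) <= wf:
                wf.add(a)
                changed = True
    return wf


def rank(a: int, less: Set[Tuple[int, int]], memo: Dict[int, int]) -> int:
    """|a| = sup{|x| + 1 : x < a}; only meaningful for a in the well-founded part."""
    if a not in memo:
        memo[a] = max(
            (rank(x, less, memo) + 1 for x in predecessors(a, less)), default=0
        )
    return memo[a]


W = well_founded_part(ELEMENTS, LESS)
RANKS = {a: rank(a, LESS, {}) for a in W}
ORDINAL = max((r + 1 for r in RANKS.values()), default=0)   # the ordinal |<|
```

Here 3 < 3 gives the infinite descending sequence 3 > 3 > 3 > ..., so neither 3 nor 4 lies in the well-founded part.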
An inductive definition can often be rephrased in the form Φ_< for a 
suitable <. 


1.2.2. DEFINITION. The rule set Φ is deterministic if 
Φ: X_1 → x & Φ: X_2 → x implies X_1 = X_2. 


1.2.3. EXAMPLE. Φ_< is always deterministic. So is Φ_ω. Also the rule sets 
defining the terms and formulae of first order logic are. 
Note that Φ_H is not deterministic. 


Now let $\Phi$ be deterministic and let $A$ be the set of conclusions of rules
in $\Phi$. For $x, y \in A$ let $x < y$ if $\Phi : X \to y$ for some set $X$ such
that $x \in X$ and $X \subseteq A$. Then $\Phi_<$ is the set of rules $X \to x$
in $\Phi$ such that $X \subseteq A$. Hence we get


1.2.4. PROPOSITION. For deterministic $\Phi$,
$I(\Phi) = I(\Phi_<)$.


For deterministic $\Phi$, functions on $I(\Phi)$ can be defined by recursion on
the way objects in $I(\Phi)$ are generated, as in transfinite recursion. An
example of this from syntax is the operation of substitution. Given an
individual variable $v$ and a term $t$, the function assigning to each formula
$\varphi(v)$ the formula $\varphi(t)$ obtained by substituting $t$ for all free
occurrences of $v$ in $\varphi(v)$ is naturally defined by a recursion on the
way a formula $\varphi(v)$ is generated.
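
The recursion for substitution can be sketched in Python as follows. The tuple encoding of terms and formulae (`("var", name)`, `("atom", R, ...)`, and so on) is a hypothetical choice made for this example, and the sketch assumes that $t$ is free for $v$ (for instance, that $t$ is closed), so that no variable capture can occur.

```python
def subst_term(term, v, t):
    """Replace the variable v by the term t inside a term."""
    kind = term[0]
    if kind == "var":
        return t if term[1] == v else term
    if kind == "const":
        return term
    # ("fn", name, arg_1, ..., arg_k)
    return ("fn", term[1]) + tuple(subst_term(a, v, t) for a in term[2:])


def subst(formula, v, t):
    """Replace the free occurrences of v in formula by t.

    One clause per way of generating a formula, mirroring the rule set
    that defines the formulae.
    """
    kind = formula[0]
    if kind == "atom":  # ("atom", R, term_1, ..., term_k)
        return ("atom", formula[1]) + tuple(
            subst_term(a, v, t) for a in formula[2:]
        )
    if kind == "not":
        return ("not", subst(formula[1], v, t))
    if kind in ("and", "or", "implies"):
        return (kind, subst(formula[1], v, t), subst(formula[2], v, t))
    if kind in ("forall", "exists"):  # (kind, bound_variable, body)
        bound, body = formula[1], formula[2]
        if bound == v:
            return formula  # v is not free below this quantifier
        return (kind, bound, subst(body, v, t))
    raise ValueError(f"unknown formula form: {kind!r}")


phi = ("and",
       ("atom", "P", ("var", "v")),
       ("forall", "v", ("atom", "Q", ("var", "v"), ("var", "w"))))
result = subst(phi, "v", ("const", "c"))
```

Because the rule set defining formulae is deterministic, each formula matches exactly one clause, which is what makes the recursion well defined.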


1.3. Inductive definitions as operators 

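Let $\varphi : \mathrm{Pow}(A) \to \mathrm{Pow}(A)$, where $\mathrm{Pow}(A)$
denotes the set of all subsets of $A$. The operator $\varphi$ is monotone if
$X \subseteq Y \subseteq A$ implies $\varphi(X) \subseteq \varphi(Y)$. Given
$\varphi$ let $\Phi_\varphi$ be the set of rules $X \to x$ such that
$X \subseteq A$ and $x \in \varphi(X)$. For monotone $\varphi$, $X \subseteq A$ is
$\Phi_\varphi$-closed just in case $\varphi(X) \subseteq X$. So
$I(\Phi_\varphi) = \bigcap\{X \subseteq A \mid \varphi(X) \subseteq X\}$. Hence it
is natural to extend the terminology concerning inductive definitions to
monotone operators $\varphi$, and we write $I(\varphi)$ for
$\bigcap\{X \subseteq A \mid \varphi(X) \subseteq X\}$ and call it the set
inductively defined by $\varphi$. All inductive definitions can be obtained
using monotone operators. For if $\Phi$ is a rule set on $A$ (i.e.
$X \cup \{x\} \subseteq A$ whenever $\Phi : X \to x$) we may define a monotone
operator $\varphi : \mathrm{Pow}(A) \to \mathrm{Pow}(A)$ by

$$\varphi(Y) = \{x \in A \mid \Phi : X \to x \text{ for some } X \subseteq Y\} \quad \text{for } Y \subseteq A.$$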

Then $Y \subseteq A$ is $\Phi$-closed just in case $\varphi(Y) \subseteq Y$, so
that $I(\Phi) = I(\varphi)$.

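For monotone operators $\varphi$ there is a useful alternative characterization
of $I(\varphi)$ using transfinite iterations $\varphi^\lambda$ of $\varphi$ for
ordinals $\lambda$. Define $\varphi^\lambda \subseteq A$ by transfinite recursion
on the ordinal $\lambda$ so that

$$\varphi^\lambda = \bigcup_{\mu<\lambda} \varphi^\mu \cup \varphi\Bigl(\bigcup_{\mu<\lambda} \varphi^\mu\Bigr).$$

Also define $\varphi^\infty = \bigcup_\lambda \varphi^\lambda$, where $\lambda$
ranges over all ordinals.

If we write $\varphi^{<\lambda}$ for $\bigcup_{\mu<\lambda} \varphi^\mu$, then

$$\varphi^\lambda = \varphi^{<\lambda} \cup \varphi(\varphi^{<\lambda}).$$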


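The sets $\varphi^{<\lambda}$ may be directly defined by the transfinite
recursion

$$\varphi^{<\lambda} = \bigcup_{\mu<\lambda} \varphi(\varphi^{<\mu}),$$

or alternatively by

$$\varphi^{<0} = \emptyset, \qquad \varphi^{<\lambda+1} = \varphi^{<\lambda} \cup \varphi(\varphi^{<\lambda}), \qquad \varphi^{<\lambda} = \bigcup_{\mu<\lambda} \varphi^{<\mu} \ \text{ for limit } \lambda.$$

We may then define $\varphi^\lambda = \varphi^{<\lambda+1}$.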


Note. The literature often uses the notation $\varphi^\lambda$ for what we have
called $\varphi^{<\lambda}$. We have adopted the notation initiated in
MOSCHOVAKIS [1974a].


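Because $X \subseteq I(\varphi)$ implies $\varphi(X) \subseteq I(\varphi)$, a
transfinite induction shows that $\varphi^\lambda \subseteq I(\varphi)$ for all
ordinals $\lambda$. Hence if we let $\varphi^\infty = \bigcup_\lambda \varphi^\lambda$
then $\varphi^\infty \subseteq I(\varphi)$.

As $\mu < \lambda$ implies $\varphi^{<\mu} \subseteq \varphi^{<\lambda} \subseteq A$,
and $A$ is a set, there must be an ordinal $\lambda$ such that
$\varphi^{<\lambda+1} = \varphi^{<\lambda}$. It follows that
$\varphi^\mu = \varphi^\lambda = \varphi^{<\lambda}$ for all $\mu \ge \lambda$, so
that $\varphi^\infty = \varphi^{<\lambda}$. Hence
$\varphi(\varphi^\infty) = \varphi(\varphi^{<\lambda}) \subseteq \varphi^\lambda = \varphi^\infty$.
Hence by $\varphi$-induction $I(\varphi) \subseteq \varphi^\infty$. Also for
monotone $\varphi$, $\mu < \lambda$ implies
$\varphi(\varphi^{<\mu}) \subseteq \varphi(\varphi^{<\lambda})$, so that
$\varphi^{<\lambda} = \bigcup_{\mu<\lambda} \varphi(\varphi^{<\mu}) \subseteq \varphi(\varphi^{<\lambda})$
and hence $\varphi^\lambda = \varphi(\varphi^{<\lambda})$. It follows that
$\varphi^\infty = \varphi(\varphi^\infty)$, i.e. $\varphi^\infty$ is a fixed point
of $\varphi$ (in fact the least one). Thus we have proved: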


1.3.1. PROPOSITION. For monotone $\varphi : \mathrm{Pow}(A) \to \mathrm{Pow}(A)$,
(i) $I(\varphi) = \varphi^\infty$, and
(ii) $I(\varphi)$ is the least fixed point of $\varphi$.
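
On a finite set the iteration stops after finitely many steps, so the stages can simply be computed. The sketch below is an illustration under assumptions of our own: rules are represented as pairs `(frozenset_of_premisses, conclusion)`, and the names `operator_from_rules` and `stages` are hypothetical.

```python
def operator_from_rules(rules):
    """Monotone operator phi(Y) = {x | X -> x is a rule with X a subset of Y}."""
    def phi(Y):
        return {x for (X, x) in rules if X <= Y}
    return phi


def stages(phi):
    """Return [phi^0, phi^1, ...], stopping when a stage adds nothing new."""
    below = set()  # phi^{<n}
    out = []
    while True:
        stage = below | phi(below)  # phi^n = phi^{<n} united with phi(phi^{<n})
        if stage == below:
            return out
        out.append(stage)
        below = stage


# Rules: (empty set) -> 0 and {n} -> n + 2 for n = 0, ..., 7.
rules = [(frozenset(), 0)] + [(frozenset({n}), n + 2) for n in range(8)]
phi = operator_from_rules(rules)
out = stages(phi)
least_fixed_point = out[-1] if out else set()

norm = {}
for n, stage in enumerate(out):
    for a in stage:
        norm.setdefault(a, n)  # least n such that a belongs to phi^n
```

Here the least fixed point consists of the even numbers up to 8; the odd numbers never enter because no rule produces 1.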

The definition of $\varphi^\infty$ above does not require the operator $\varphi$
to be monotone. Hence it is natural to extend the notion of an inductive
definition to non-monotone operators by calling $\varphi^\infty$ the set
inductively defined by $\varphi$ for any operator
$\varphi : \mathrm{Pow}(A) \to \mathrm{Pow}(A)$. It has turned out, perhaps
rather surprisingly, that the theory of non-monotone inductive definitions is
as rich and interesting as the theory for monotone operators, even though
naturally occurring examples of non-monotone induction are harder to come by.
Perhaps their main motivation can be seen in terms of systems of notations for
ordinals. Associated with any operator
$\varphi : \mathrm{Pow}(A) \to \mathrm{Pow}(A)$ is a function $|\ |_\varphi$
mapping $\varphi^\infty$ onto an initial segment of the ordinals, given by

$$|a|_\varphi = \text{least } \lambda \text{ such that } a \in \varphi^\lambda \quad \text{for } a \in \varphi^\infty.$$

Let $|\varphi| = \sup\{|a|_\varphi + 1 \mid a \in \varphi^\infty\}$. Then
$\varphi^\infty$ is a set of notations for the ordinals $< |\varphi|$ via the
mapping $|\ |_\varphi : \varphi^\infty \to |\varphi|$. (Note that we follow the
standard convention of identifying an ordinal with its set of predecessors.)
The ordinal $|\varphi|$ may also be characterized as the least ordinal $\lambda$
such that $\varphi^{<\lambda+1} = \varphi^{<\lambda}$. Hence
$\varphi^\infty = \varphi^{|\varphi|} = \varphi^{<|\varphi|}$.


1.3.2. EXAMPLE. Let $<$ be a binary relation on the set $A$ and let $\varphi$ be
the monotone operator corresponding to the rule set $\Phi_<$, so that
$W(<) = I(\varphi) = \varphi^\infty$. Then it is easily seen that
$\varphi^\lambda = \{a \in W(<) \mid |a|_< \le \lambda\}$, so that
$|a|_\varphi = |a|_<$ for $a \in W(<)$ and $|\varphi| = |<|$.


An interesting general problem connected with an operator
$\varphi : \mathrm{Pow}(A) \to \mathrm{Pow}(A)$ is to characterize or estimate
the ordinal $|\varphi|$. As $|\ |_\varphi : \varphi^\infty \to |\varphi|$ is a
surjection and $\varphi^\infty \subseteq A$, the cardinality of $|\varphi|$ must
be $\le$ the cardinality of $A$.

For monotone operators a better bound can often be found. If $\kappa$ is a
cardinal let us say that $\varphi$ is $\kappa$-based if:

$$x \in \varphi(X) \text{ implies } x \in \varphi(Y) \text{ for some } Y \subseteq X \text{ of cardinality } < \kappa.$$


1.3.3. EXAMPLE. If $\Phi$ is a finitary rule set on $A$, then the monotone
operator $\varphi$ corresponding to it is $\omega$-based. In general, if every
set of premisses of a rule in $\Phi$ has cardinality $< \kappa$, then $\varphi$
is $\kappa$-based.


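1.3.4. PROPOSITION. Let $\varphi$ be a $\kappa$-based monotone operator where
$\kappa$ is regular. Then $|\varphi| \le \kappa$, so that
$I(\varphi) = \varphi^{<\kappa}$.

PROOF. It suffices to show that $\varphi^\kappa \subseteq \varphi^{<\kappa}$. So
let $x \in \varphi^\kappa = \varphi(\varphi^{<\kappa})$. Then $x \in \varphi(X)$
for some $X \subseteq \varphi^{<\kappa}$ of cardinality $< \kappa$. By the
regularity of $\kappa$, $X \subseteq \varphi^{<\lambda}$ for some
$\lambda < \kappa$, so that
$x \in \varphi(\varphi^{<\lambda}) = \varphi^\lambda \subseteq \varphi^{<\kappa}$
as required. □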


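1.3.5. EXAMPLE. If $\Phi$ is a finitary rule set with corresponding monotone
operator $\varphi$, then $|\varphi| \le \omega$ and
$I(\varphi) = \varphi^{<\omega} = \bigcup_{n<\omega} \varphi^{<n}$, where
$\varphi^{<0} = \emptyset$ and

$$\varphi^{<n+1} = \varphi(\varphi^{<n}) \quad \text{for } n < \omega.$$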


1.4. Concepts of ‘‘proof’’ for monotone induction 
In 1.1.3 we formulated a notion of finite length proof appropriate for 
finitary rule sets. We now consider a more general notion. 


1.4.1. DEFINITION. Let $\varphi$ be a monotone operator on $A$. A transfinite
sequence $\{a_\mu\}_{\mu \le \lambda}$ is a $\varphi$-proof of $b$ with length
$\lambda$ if

(i) $a_\lambda = b$.

(ii) $a_\nu \in \varphi(\{a_\mu \mid \mu < \nu\})$ for all $\nu \le \lambda$.


As in Proposition 1.1.4 we get: 


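1.4.2. PROPOSITION. (i) For any regular cardinal $\kappa$,
$\varphi^{<\kappa} = \{a \in A \mid a \text{ has a } \varphi\text{-proof of length } < \kappa\}$.
(ii) $I(\varphi) = \{a \in A \mid a \text{ has a } \varphi\text{-proof}\}$.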


It is sometimes convenient to use an alternative notion of proof that uses 
well-founded trees instead of transfinite sequences. An example is the 
notion of derivation for the Gentzen style system G of Chapter A.1. There 
the appropriate rule set is finitary so that the well-founded trees are 
actually finite. 


1.4.3. DEFINITION. A (well-founded) tree $T$ is a set of finite sequences of
length $> 0$ such that

(i) There is exactly one sequence of length one in $T$. It is called the root
$(a_T)$ of the tree.

(ii) If $(a_1, \ldots, a_{n+1}) \in T$, then $(a_1, \ldots, a_n) \in T$.

(iii) $T$ is well-founded in the sense that there is no infinite sequence
$a_1, a_2, \ldots$ such that $(a_1, \ldots, a_n) \in T$ for all $n > 0$.
Alternatively, the relation $<_T$ is well-founded, where

$$(a_1, \ldots, a_n) <_T (b_1, \ldots, b_m) \quad \text{iff} \quad n = m + 1 \text{ and } a_i = b_i \text{ for } i = 1, \ldots, m.$$

Define the length $|T|$ of $T$ to be $|(a_T)|_{<_T}$.

1.4.4. DEFINITION. If $\Phi$ is a rule set, a tree $T$ is a $\Phi$-proof of $a$
if $a = a_T$ and $\Phi : T_{(a_1, \ldots, a_n)} \to a_n$ whenever
$(a_1, \ldots, a_n) \in T$, where

$$T_{(a_1, \ldots, a_n)} = \{x \mid (a_1, \ldots, a_n, x) \in T\}.$$


1.4.5. PROPOSITION. (i) $I(\Phi) = \{a \mid a \text{ has a tree } \Phi\text{-proof}\}$.

(ii) If $\Phi$ is a rule set on $A$ with corresponding monotone operator
$\varphi : \mathrm{Pow}(A) \to \mathrm{Pow}(A)$, then for all ordinals $\lambda$,

$$\varphi^\lambda = \{a \in A \mid a \text{ has a tree } \Phi\text{-proof of length } \le \lambda\}.$$


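PROOF. (i) follows easily from (ii). (ii) will be proved by induction on
$\lambda$. Let $X^\lambda$ denote the right-hand side of (ii) and let
$X^{<\lambda} = \bigcup_{\mu<\lambda} X^\mu$. By induction hypothesis
$\varphi^{<\lambda} = X^{<\lambda}$.

Let $a \in X^\lambda$. Then $a$ has a $\Phi$-proof $T$ with $|T| \le \lambda$.
For each $x \in T_{(a)}$ let

$$T^x = \{(x, x_1, \ldots, x_n) \mid n \ge 0 \ \&\ (a, x, x_1, \ldots, x_n) \in T\}.$$

Then $T^x$ is a $\Phi$-proof of $x$ with $|T^x| < \lambda$. Hence
$T_{(a)} \subseteq X^{<\lambda} = \varphi^{<\lambda}$. As $\Phi : T_{(a)} \to a$
it follows that $a \in \varphi(\varphi^{<\lambda}) = \varphi^\lambda$.

Conversely, let $a \in \varphi^\lambda$. Then $a \in \varphi(X^{<\lambda})$, so
that $\Phi : X \to a$ for some $X \subseteq X^{<\lambda}$. For each $x \in X$
let $T^x$ be a $\Phi$-proof of $x$ with $|T^x| < \lambda$. Let

$$T = \{(a)\} \cup \{(a, x, x_1, \ldots, x_n) \mid n \ge 0 \ \&\ x \in X \ \&\ (x, x_1, \ldots, x_n) \in T^x\}.$$

Then $T$ is a $\Phi$-proof of $a$ with $|T| \le \lambda$, so that
$a \in X^\lambda$. □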


1.5. Monotone induction and games 

For those familiar with the elementary concepts associated with games
we give a game-theoretic characterization of $I(\Phi)$ for an arbitrary rule
set $\Phi$.

For each $a$ we define a game $G(\Phi, a)$ between two players I and II who
move alternately when possible. The play starts by II choosing $a_0 = a$. If
after $n$ pairs of moves player II chooses $a_n$, then player I must respond by
choosing a set $X_n$ such that $\Phi : X_n \to a_n$, and then II must respond
by choosing $a_{n+1} \in X_n$. If either player cannot move then he loses. If
the game continues indefinitely, then player I loses.


1.5.1. PROPOSITION. $a \in I(\Phi)$ iff player I has a winning strategy in the
game $G(\Phi, a)$.


PROOF. Let $W$ be the set of those $a$ such that the right-hand side holds. Let
$\Phi : X \to a$ with $X \subseteq W$. For each $x \in X$ let $\sigma_x$ be a
winning strategy for I in $G(\Phi, x)$. Define the following strategy $\sigma$
for I in $G(\Phi, a)$. I starts by playing $X$, and then if II chooses
$x \in X$, I continues by using the strategy $\sigma_x$. $\sigma$ is clearly a
winning strategy. Hence $a \in W$. Thus $W$ is $\Phi$-closed, so that by
$\Phi$-induction $I(\Phi) \subseteq W$.

For the converse let $a \in W$. Let $\sigma$ be a winning strategy for I in
$G(\Phi, a)$. Let $T$ be the set of possible finite sequences of moves of II
when I follows $\sigma$. Then observe that $T$ is a tree $\Phi$-proof of $a$, so
that by Proposition 1.4.5, $a \in I(\Phi)$. Thus $W \subseteq I(\Phi)$. □
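
For a finite rule set the winning strategy built in the first half of the proof can be computed and played out. The following Python sketch is only an illustration: rules are assumed to be given as pairs `(frozenset_of_premisses, conclusion)`, and `winning_strategy` and `play` are hypothetical names.

```python
def winning_strategy(rules):
    """Return {a: X}: player I answers II's move a with the premiss set X.

    An element receives an answer only after every element of X has one,
    so each play that follows the strategy must terminate.
    """
    strategy = {}
    changed = True
    while changed:
        changed = False
        for X, a in rules:
            if a not in strategy and all(x in strategy for x in X):
                strategy[a] = X
                changed = True
    return strategy


def play(strategy, a, choose):
    """Play G(Phi, a) with I following `strategy` and II picking via `choose`.

    Returns (loser, list of II's moves).
    """
    moves = [a]
    while True:
        if a not in strategy:
            return "I", moves   # I has no answer prepared and loses
        X = strategy[a]
        if not X:
            return "II", moves  # II must pick from the empty set and loses
        a = choose(X)
        moves.append(a)


rules = [
    (frozenset(), 0),
    (frozenset({0}), 1),
    (frozenset({0, 1}), 2),
    (frozenset({3}), 3),        # circular rule: 3 is never generated
    (frozenset({2, 3}), 4),
]
strategy = winning_strategy(rules)
outcome_min = play(strategy, 2, min)
outcome_max = play(strategy, 2, max)
outcome_bad = play(strategy, 3, min)
```

The domain of the computed strategy is exactly $I(\Phi)$ for this rule set, and the sequences of II's moves recorded by `play` are branches of the corresponding tree proof.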


1.6. Kernels — the dual of an inductive definition 

Sometimes inductive definitions present themselves more naturally in a 
dual form. 

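If $\Phi$ is a rule set let us say that a set $X$ is $\Phi$-dense if for every
$x \in X$ there is a set $Y \subseteq X$ such that $\Phi : Y \to x$. Define the
kernel $K(\Phi) = \bigcup\{X \mid X \text{ is } \Phi\text{-dense}\}$. $K(\Phi)$
itself is $\Phi$-dense and is the largest $\Phi$-dense set. If $\Phi$ is a rule
set on $A$ and $\varphi : \mathrm{Pow}(A) \to \mathrm{Pow}(A)$ is the monotone
operator associated with $\Phi$, then $X \subseteq A$ is $\Phi$-dense iff
$X \subseteq \varphi(X)$. Hence
$K(\Phi) = \bigcup\{X \subseteq A \mid X \subseteq \varphi(X)\}$ and we shall
define $K(\varphi) = \bigcup\{X \subseteq A \mid X \subseteq \varphi(X)\}$ for
any monotone $\varphi$. To make explicit the duality between the kernel
construction and induction, define the dual of an operator $\varphi$ to be the
operator $\breve{\varphi}$ given by $\breve{\varphi}(X) = \neg\varphi(\neg X)$,
where $\neg X = A - X$ for $X \subseteq A$. Then $X \subseteq A$ is
$\varphi$-dense iff $\neg X$ is $\breve{\varphi}$-closed, so that
$K(\varphi) = \neg I(\breve{\varphi})$. It follows that $K(\varphi)$ can be
defined in terms of transfinite iterations
$\varphi^{(\lambda)} = \neg\breve{\varphi}^\lambda$, as
$K(\varphi) = \bigcap_\lambda \varphi^{(\lambda)}$ where
$\varphi^{(\lambda)} = \varphi(\bigcap_{\mu<\lambda} \varphi^{(\mu)})$.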
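
The duality can be checked on a small finite example. In the sketch below (an illustration only; the graph, the operator and the helper names `iterate_down`, `iterate_up` and `dual` are assumptions made for the example) $\varphi(X)$ is the set of nodes having a successor in $X$, so the dense sets are those in which every node has a successor inside the set.

```python
def iterate_down(phi, A):
    """Kernel of a monotone phi on a finite set: A, phi(A), phi(phi(A)), ..."""
    X = set(A)
    while phi(X) != X:
        X = phi(X)
    return X


def iterate_up(phi):
    """Least fixed point of a monotone phi on a finite set, starting from {}."""
    X = set()
    while phi(X) != X:
        X = phi(X)
    return X


def dual(phi, A):
    """The dual operator X |-> A - phi(A - X)."""
    A = set(A)
    return lambda X: A - phi(A - set(X))


A = {0, 1, 2, 3, 4}
edges = {(0, 1), (1, 2), (2, 1), (3, 4)}


def phi(X):
    # x belongs to phi(X) iff some edge leads from x into X.
    return {x for (x, y) in edges if y in X}


K = iterate_down(phi, A)
I_dual = iterate_up(dual(phi, A))
```

Here the kernel consists of the nodes from which an infinite path starts, and its complement is generated inductively by the dual operator.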

An example of the above is the Cantor-Bendixson construction in general
topology. Let $E$ be a subset of a topological space. Let $\Phi$ be the set of
rules $X \to x$ such that $X \subseteq E$ and $x \in E$ is a limit point of $X$.
The corresponding monotone operator
$\varphi : \mathrm{Pow}(E) \to \mathrm{Pow}(E)$ is the derived-set operation on
$E$, sending $X$ to the set of its limit points in $E$. A set $X \subseteq E$ is
closed in $E$ just in case $X$ is $\Phi$-closed and is dense in itself just in
case it is $\Phi$-dense. Thus $K(\varphi)$ is the largest dense-in-itself subset
of $E$, called the kernel $K$ of $E$. When $E$ is a closed subset of a space
with a countable basis, then $E = K \cup S$ where $K$ is perfect and
$S = \neg K = I(\breve{\varphi})$ is a countable set, so that the closure ordinal
$|\breve{\varphi}|$ must be countable. This is the Cantor-Bendixson
representation of a closed subset of a space with a countable basis.

Another example of the kernel construction comes from the theory of abelian
$p$-groups. These are abelian groups where every element has finite order $p^n$
for some $n$, where $p$ is a fixed prime. Let $G$ be an abelian $p$-group.
Define $\varphi : \mathrm{Pow}(G) \to \mathrm{Pow}(G)$ by

$$\varphi(X) = pX = \{\underbrace{g + \cdots + g}_{p} \mid g \in X\} \quad \text{for } X \subseteq G.$$

Then $\varphi$ is a monotone operator mapping subgroups of $G$ to subgroups of
$G$. The $\varphi$-dense subgroups of $G$ are just those that are said to be
divisible, so that $K(\varphi)$ is the largest divisible subgroup of $G$. Much
of the structure theory of abelian $p$-groups is concerned with the descending
hierarchy of subgroups $\{\varphi^{(\lambda)}\}_\lambda$.


1.7. Some examples of induction in classical mathematics 

(1) If $X$ is a subset of a group $G$ then there is a smallest subgroup $H$ of
$G$ that contains $X$. $H$ is inductively defined by the set of rules
$\emptyset \to x$ for $x \in X \cup \{e\}$ and $\{a, b\} \to ab^{-1}$ for
$a, b \in G$. The same notion is used with other algebraic structures such as
rings, fields and vector spaces. A slightly different sort of example is the
algebraic closure of a subfield of an algebraically closed field. All these
examples involve finitary rule sets.

(2) If $R$ is a binary relation on a set $A$, then the transitive closure of
$R$ is the smallest transitive relation extending $R$. It is inductively
defined by the set of rules $\emptyset \to (a, b)$ if $aRb$ and
$\{(a, b), (b, c)\} \to (a, c)$ for $a, b, c \in A$. Similarly the equivalence
relation generated by $R$ is inductively defined by the above rules together
with the rules $\emptyset \to (a, a)$ for $a \in A$ and $\{(a, b)\} \to (b, a)$
for $a, b \in A$.
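
The rules in (2) translate directly into a closure computation on a finite relation. The following Python sketch is illustrative; the function names are hypothetical and relations are assumed to be given as sets of pairs.

```python
def transitive_closure(R):
    """Least relation containing R and closed under {(a,b),(b,c)} -> (a,c)."""
    closure = set(R)  # rules  (empty set) -> (a, b)  for a R b
    while True:
        new = {(a, c)
               for (a, b) in closure
               for (b2, c) in closure if b == b2} - closure
        if not new:
            return closure
        closure |= new


def equivalence_closure(A, R):
    """Equivalence relation on A generated by R."""
    rel = set(R) | {(a, a) for a in A}  # rules  (empty set) -> (a, a)
    while True:
        new = {(b, a) for (a, b) in rel}                   # {(a,b)} -> (b,a)
        new |= {(a, c)
                for (a, b) in rel
                for (b2, c) in rel if b == b2}             # transitivity
        if new <= rel:
            return rel
        rel |= new


chain = transitive_closure({(1, 2), (2, 3), (3, 4)})
equiv = equivalence_closure({1, 2, 3}, {(1, 2)})
```

Each pass of the loop corresponds to one finite stage of the associated monotone operator.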

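(3) For an example of a non-finitary inductive definition we turn to
$\sigma$-rings and Borel sets. Recall that a set
$\mathcal{C} \subseteq \mathrm{Pow}(A)$ is a $\sigma$-ring if it is closed under
complements and unions of countable subsets; i.e., combining these into one,
$\mathcal{C}$ is a $\sigma$-ring if $\mathcal{C}$ is $\Phi$-closed, where $\Phi$
consists of the rules $\{A_n \mid n \in \omega\} \to \bigcup_{n \in \omega} \neg A_n$
for countable families $\{A_n\}_{n \in \omega}$ with $A_n \subseteq A$. Hence the
$\sigma$-ring generated by $\mathcal{C}_0 \subseteq \mathrm{Pow}(A)$ is
inductively defined by the rule set $\Phi'$ consisting of the rules in $\Phi$
together with the rules $\emptyset \to X$ for $X \in \mathcal{C}_0$. As an
example, the Borel sets of reals are the special case where $A = \mathbb{R}$ and
$\mathcal{C}_0$ is the set of open subsets of $\mathbb{R}$. If
$\varphi : \mathrm{Pow}(\mathrm{Pow}(A)) \to \mathrm{Pow}(\mathrm{Pow}(A))$ is
the monotone operator associated with $\Phi'$ then $\varphi$ is $\aleph_1$-based,
so that by Proposition 1.3.4, $|\varphi| \le \aleph_1$. The stages
$\varphi^\lambda$ for $\lambda < \aleph_1$ are just the familiar stages of the
Borel hierarchy. $\varphi^0$ is the set of open sets and for $\lambda > 0$, the
sets in $\varphi^\lambda$ are those of the form
$\bigcup_{n \in \omega} \neg A_n$ where each $A_n \in \varphi^{<\lambda}$.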


2. Induction in recursion theory 


2.1. Recursively enumerable relations 
There are two key results relating the recursively enumerable (r.e.)
relations to the finitely presented formal systems such as formal arithmetic:

(I) The theorems of a finitely presented formal system form an r.e. set,
when Gödel numbered.

(II) Every r.e. relation can be represented in any sufficiently rich formal
system.

These results are important in Gödel's incompleteness theorem. The
general notion of representability will be considered in the next section.
Here we wish to give a result concerning inductive definitions that yields I.

As we have seen, a formal system will determine a rule set that
inductively defines the set of theorems. When expressions are Gödel
numbered, a rule set on $\omega$ is induced that inductively defines the set of
Gödel numbers of theorems. If the formal system is finitely presented the
rule set will be a recursive finitary one as defined below.


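2.1.1. DEFINITION. Let $\Phi$ be a finitary rule set on $\omega$. $\Phi$ is
recursive (r.e.) if the relation $R_\Phi$ is recursive (r.e.), where $R_\Phi$ is
the set of pairs $(\langle a_1, \ldots, a_n \rangle, b)$ such that
$\Phi : \{a_1, \ldots, a_n\} \to b$.


Here $\langle\ \rangle : \bigcup_{n \in \omega} \omega^n \to \omega$ is a standard
constructive injective coding function for finite sequences of natural
numbers. Associated with it are recursive functions
$\mathrm{lh} : \omega \to \omega$ and $q : \omega \times \omega \to \omega$, and
they satisfy the following:

(i) The range Seq of $\langle\ \rangle$ is recursive.

(ii) For each $n > 0$ the function
$\langle\ \rangle \restriction \omega^n : \omega^n \to \omega$ is recursive.

(iii) $\mathrm{lh}(\langle x_1, \ldots, x_n \rangle) = n$ for $n \ge 0$.

(iv) $q(\langle x_1, \ldots, x_n \rangle, i) = x_i$ for $1 \le i \le n$.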
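
One concrete coding with these properties, offered here only as an illustration and not as the coding intended by the text, sends $\langle x_1, \ldots, x_n \rangle$ to $p_1^{x_1+1} \cdots p_n^{x_n+1}$, where $p_i$ is the $i$-th prime. The function names below are hypothetical, and the value 0 returned by `q` outside the range $1 \le i \le n$ is a convention chosen for the sketch.

```python
def primes():
    """Yield 2, 3, 5, ... by trial division."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1


def encode(xs):
    """Code <x_1, ..., x_n> as p_1^(x_1+1) * ... * p_n^(x_n+1)."""
    code = 1
    for p, x in zip(primes(), xs):
        code *= p ** (x + 1)
    return code


def decode(code):
    """Return the coded sequence, or None if code is not in Seq."""
    if code < 1:
        return None
    xs = []
    for p in primes():
        if code == 1:
            return xs
        e = 0
        while code % p == 0:
            code //= p
            e += 1
        if e == 0:
            return None  # the primes used do not form an initial segment
        xs.append(e - 1)


def lh(code):
    return len(decode(code))


def q(code, i):
    xs = decode(code)
    return xs[i - 1] if 1 <= i <= len(xs) else 0


c = encode([3, 0, 2])  # 2**4 * 3**1 * 5**3
```

Membership in Seq, `lh` and `q` are all computed by bounded searches over the prime factors of the code, which is what makes them recursive.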


2.1.2. PROPOSITION. If $\Phi$ is an r.e. finitary rule set on $\omega$, then
$I(\Phi)$ is r.e.


PROOF. As $\Phi$ is finitary, by Proposition 1.1.4,

$$I(\Phi) = \{a \in \omega \mid \exists y\, \mathrm{Pr}_\Phi(a, y)\},$$

where $\mathrm{Pr}_\Phi(a, b)$ iff $b = \langle a_1, \ldots, a_n \rangle$ for some
$\Phi$-proof $a_1, \ldots, a_n$ of $a$. Hence to prove the proposition it
suffices to show that $\mathrm{Pr}_\Phi$ is r.e. But this is just a matter of
coding:

$$\mathrm{Pr}_\Phi(a, b) \iff \mathrm{Seq}(b) \ \&\ q(b, \mathrm{lh}(b)) = a \ \&\ \forall i < \mathrm{lh}(b)\ \exists z\, \bigl[R_\Phi(z, q(b, i+1)) \ \&\ \forall j < \mathrm{lh}(z)\ \exists k < i\ (q(z, j+1) = q(b, k+1))\bigr].$$

By hypothesis, using standard closure properties of the r.e. relations, we see
that the right-hand side is r.e. □
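
The content of the predicate $\mathrm{Pr}_\Phi$ can be sketched without the coding, working directly with finite sequences. In the Python illustration below the representation of rules as pairs `(frozenset_of_premisses, conclusion)` and the name `is_proof` are assumptions made for the example; the sketch also requires the sequence to be non-empty.

```python
def is_proof(rules, seq, a):
    """Check that seq is a proof of a: it ends in a, and every entry is the
    conclusion of a rule whose premisses all occur strictly earlier."""
    if not seq or seq[-1] != a:
        return False
    for i, b in enumerate(seq):
        earlier = set(seq[:i])
        # Corresponds to: exists z [R(z, b) and every member of z occurs
        # among the entries before position i].
        if not any(concl == b and prem <= earlier for prem, concl in rules):
            return False
    return True


rules = [
    (frozenset(), 0),
    (frozenset({0}), 1),
    (frozenset({0, 1}), 2),
]
ok = is_proof(rules, [0, 1, 2], 2)
wrong_order = is_proof(rules, [1, 0, 2], 2)
wrong_goal = is_proof(rules, [0, 1], 2)
```

When the rule set is only r.e. rather than finite, the search for a suitable rule becomes an unbounded existential quantifier, which is why the predicate is r.e. and not necessarily recursive.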

2.2. $\Pi^1_1$ relations

When infinitary rules of inference are allowed in a formal system then
more general notions of induction are required, e.g. in GRZEGORCZYK,
MOSTOWSKI and RYLL-NARDZEWSKI [1958] the $\omega$-rule is added to an
otherwise finitely presented formal system for second order arithmetic.
The $\omega$-rule allows one to infer $\forall x\, \varphi(x)$ from an infinite
set of premisses $\varphi(0), \varphi(1), \ldots$. When Gödel numbered, such a
formal system induces a regular arithmetical rule set $\Phi$ on $\omega$.


2.2.1. DEFINITION. A rule set $\Phi$ on $\omega$ is regular arithmetical if
there are arithmetical relations $R$ and $S$ such that $\Phi$ is the set of
rules $R_a \to b$ such that $S(a, b)$, where
$R_a = \{y \in \omega \mid R(a, y)\}$.


2.2.2. PROPOSITION. (i) If $\Phi$ is a regular arithmetical rule set on
$\omega$, then the associated monotone operator
$\varphi : \mathrm{Pow}(\omega) \to \mathrm{Pow}(\omega)$ is $\Pi^1_1$ (i.e.,
$\{(X, x) \in \mathrm{Pow}(\omega) \times \omega \mid x \in \varphi(X)\}$ is
$\Pi^1_1$).

(ii) For any monotone $\Pi^1_1$ operator $\varphi$ the set $I(\varphi)$ is also
$\Pi^1_1$.


PROOF. (i) Let $\Phi$ be the set of rules $R_a \to b$ such that $S(a, b)$, where
$R$, $S$ are arithmetical. Then the associated monotone operator $\varphi$ is
given by
$\varphi(X) = \{b \in \omega \mid \exists a\, [S(a, b) \ \&\ R_a \subseteq X]\}$
for $X \subseteq \omega$. $\varphi$ is arithmetical and hence $\Pi^1_1$.

(ii) $I(\varphi) = \{a \in \omega \mid \forall X\, [\forall x\, [x \in \varphi(X) \to x \in X] \to a \in X]\}$,
so that if $\varphi$ is $\Pi^1_1$, standard quantifier manipulations show that
$I(\varphi)$ is also $\Pi^1_1$. □


Hence when considering systems with the $\omega$-rule the above proposition
suggests that the class of r.e. relations in I and II of 2.1 should be replaced
by the class of $\Pi^1_1$ relations.

Below we give two further examples of regular arithmetical rule sets. 

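The first example occurs in the definition of Kleene's system of notations
for the recursive ordinals. This may be given as follows. Let $<$ be the
smallest transitive relation on $\omega$ such that

(i) $a < 2^a$ for $a \in \omega$,

(ii) $\{e\}(n) < 3 \cdot 5^e$ for $e, n \in \omega$ such that $\{e\}(n)$ is
defined. (Here $\{e\}$ is the $e$-th partial recursive function in a standard
enumeration.)

Then $<$ is easily seen to be an r.e. relation. Now let $\Phi$ be the set of
rules $\emptyset \to 1$, $\{a\} \to 2^a$ for $a \in \omega$, and
$\{\{e\}(n) \mid n < \omega\} \to 3 \cdot 5^e$ for $e \in \omega$ such that
$\{e\}(n)$ is defined and $\{e\}(n) < \{e\}(n+1)$ for all $n \in \omega$. Then
let $\mathcal{O} = I(\Phi)$. Let $a <_{\mathcal{O}} b$ iff
$a, b \in \mathcal{O}$ and $a < b$. As $\Phi$ is regular arithmetical,
$\mathcal{O}$ and hence $<_{\mathcal{O}}$ are $\Pi^1_1$. For
$a \in \mathcal{O}$ let $|a|_{\mathcal{O}} = |a|_\varphi$, where $\varphi$ is the
monotone operator associated with $\Phi$. Then $|1|_{\mathcal{O}} = 0$,
$|2^a|_{\mathcal{O}} = |a|_{\mathcal{O}} + 1$ for $a \in \mathcal{O}$, and
$|3 \cdot 5^e|_{\mathcal{O}} = \lim_{n<\omega} |\{e\}(n)|_{\mathcal{O}}$ for
$3 \cdot 5^e \in \mathcal{O}$. Thus
$(\mathcal{O}, <_{\mathcal{O}}, |\ |_{\mathcal{O}})$ is Kleene's recursive
analogue of the countable ordinals. The ordinal
$|\varphi| = \sup\{|a|_{\mathcal{O}} + 1 \mid a \in \mathcal{O}\}$ is the
Church-Kleene ordinal $\omega_1$, the first admissible ordinal $> \omega$.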

As another example of a regular arithmetical induction we consider the
hyperarithmetical hierarchy as formulated in MOSCHOVAKIS [1974a]. This is
a recursive analogue of the Borel hierarchy. Call a set
$\mathcal{B} \subseteq \mathrm{Pow}(\omega)$ an effective $\sigma$-ring if
$\mathcal{B}$ effectively contains the r.e. sets and is effectively closed
under complementation and countable unions. I.e., there is a set
$I \subseteq \omega$ and sets $B_i \subseteq \omega$ for $i \in I$ such that
$\mathcal{B} = \{B_i \mid i \in I\}$ and:

(i) There is a recursive function $\tau_1 : \omega \to I$ such that
$R_e = B_{\tau_1(e)}$ for all $e \in \omega$. (Recall that $R_e$ is the $e$-th
r.e. set in a standard enumeration.)

(ii) There is a recursive function $\tau_2 : \omega \to \omega$ such that if
$\{e\}$ is a total function $f : \omega \to I$, then

$$\tau_2(e) \in I \quad \text{and} \quad B_{\tau_2(e)} = \bigcup_{n \in \omega} \neg B_{f(n)}.$$

An effective $\sigma$-ring may be constructed as follows. Let
$\tau_1(n) = 2^n$. Let $\tau_2(e) = 3 \cdot 5^e$. Let $\Phi$ be the set of rules
$\emptyset \to \tau_1(n)$ for $n \in \omega$ and
$\{f(n) \mid n \in \omega\} \to \tau_2(e)$ for $e \in \omega$ such that $\{e\}$ is
a totally defined function $f$. Let $I = I(\Phi)$. Then $\Phi$ is a
deterministic rule set (see 1.2), so that by recursion we may define
$B_e \subseteq \omega$ for $e \in I$ by

$$B_{\tau_1(e)} = R_e \quad \text{for } e \in \omega,$$
$$B_{\tau_2(e)} = \bigcup_{n \in \omega} \neg B_{f(n)} \quad \text{if } \{e\} = f : \omega \to I.$$

Then clearly $\mathcal{B} = \{B_i \mid i \in I\}$ is an effective $\sigma$-ring.

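As $\Phi$ is regular arithmetical, $I = I(\Phi)$ is $\Pi^1_1$. Let $\varphi$ be
the monotone operator associated with $\Phi$. $\mathcal{B}$ can be arranged in
a hierarchy $\mathcal{B} = \bigcup_{\lambda<\omega_1} \mathcal{B}^\lambda$, where
$\mathcal{B}^\lambda = \{B_e \mid e \in \varphi^\lambda\}$ for
$\lambda < \omega_1$ $(= |\varphi|)$. So $\mathcal{B}^0$ is the set of r.e. sets,
$\mathcal{B}^1$ is the set of $\Sigma^0_2$ sets, and
$\bigcup_{n<\omega} \mathcal{B}^n$ is the set of arithmetical sets. In
Section 8E of MOSCHOVAKIS [1974a] there is a proof of the following result.


2.2.3. THEOREM. $\mathcal{B}$ is the smallest effective $\sigma$-ring and
coincides with the set of $\Delta^1_1$ subsets of $\omega$.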


2.3. Representability 

In this subsection we generalize the approach to the recursively enumerable
and $\Pi^1_1$ relations on $\omega$ in terms of representability to arbitrary
structures.

Suppose that we have a set $A$ and a theory $T$ that has individual
constants for elements of $A$ as well as individual variables. (The same
symbol will be used for a constant and the object it names.)


2.3.1. DEFINITION. $R \subseteq A^n$ is $T$-represented by the formula
$\theta(\bar{x})$ if $\bar{x} = x_1, \ldots, x_n$ and

$$R(\bar{a}) \iff T \vdash \theta(\bar{a}) \quad \text{for } \bar{a} \in A^n.$$


If $A = \omega$ and $T$ is a finitely presented theory such as formal
arithmetic, then the set $\ulcorner T \urcorner$ of Gödel numbers
$\ulcorner \theta \urcorner$ of theorems $\theta$ of $T$ forms an r.e. set, and
hence every $T$-representable relation is also r.e. For if $R$ is
$T$-represented by $\theta(\bar{x})$, then
$R(\bar{a}) \iff f(\bar{a}) \in \ulcorner T \urcorner$ for
$\bar{a} \in \omega^n$, where $f$ is the recursive function given by
$f(\bar{a}) = \ulcorner \theta(\bar{a}) \urcorner$ for $\bar{a} \in \omega^n$.
Result II of 2.1 gives a converse of this result for sufficiently rich $T$.
Hence for suitable systems $T$, the $T$-representable relations are exactly the
r.e. relations.

Ordinary recursion theory can be developed from scratch by making a 
suitable choice of T. For example Post’s canonical systems (see Post 
[1943]) make one such choice, which is further refined in SMULLYAN [1961]. 
Kleene’s systems of equations for representing the partial recursive 
functions (see KLEENE [1952]) give another approach. 

In GRZEGORCZYK, MOSTOWSKI and RYLL-NARDZEWSKI [1958] it is shown that in
second order formal arithmetic with the $\omega$-rule exactly the $\Pi^1_1$
relations are representable. Hence the notion of representability gives a way
of uniformly treating the r.e. relations and the $\Pi^1_1$ relations. Below we
give such a uniform treatment for arbitrary structures $\mathfrak{A}$ that
gives these classes of relations on $\omega$ when $\mathfrak{A}$ is the
structure $\mathfrak{N} = (\omega, S, P)$ of arithmetic, where $S$ and $P$ are
the graphs of addition and multiplication.

First we need to give some definitions. Given a set $A$ we introduce the
full first order language $L^A$ over $A$. $L^A$ has individual constants for the
elements of $A$ and relation constants for the relations on $A$. There are also
individual variables and $n$-ary relation variables for each $n > 0$.

The elementary (i.e. first order) formulae of $L^A$ are built up in the usual
way using the connectives and the individual quantifiers $\forall x$,
$\exists x$. The second order formulae are obtained by allowing quantifiers
$\forall X^n$, $\exists X^n$ where $X^n$ is an $n$-place relation variable. All
elementary or second order sentences of $L^A$ are either true or false in the
standard interpretation. We shall be interested in various subclasses of
formulae. The existential formulae are built up from atomic formulae and their
negations using $\vee$, $\wedge$ and $\exists x$. Given a binary relation $<$ on
$A$ we may introduce the restricted quantifiers $\forall x < y$,
$\exists x < y$ abbreviating $\forall x\, (x < y \to \ldots)$ and
$\exists x\, (x < y \wedge \ldots)$. Then we may define the restricted
elementary formulae as those built up using only restricted quantifiers. We
may define the $\Sigma^0_n$ and $\Pi^0_n$ prenex formulae, where we allow the
matrix to be restricted, in the usual way, e.g. $\Sigma^0_2$ formulae have the
form $\exists x_1 \cdots \exists x_n \forall y_1 \cdots \forall y_m\, \theta$,
where $n, m \ge 0$ and $\theta$ is restricted.

We will also be interested in classifications of second order formulae. A
formula is $\Pi^1_1$ if it has the form
$\forall X_1 \cdots \forall X_m\, \theta$ where $m \ge 0$ and $\theta$ is
elementary. Similarly we define the $\Pi^1_n$ and $\Sigma^1_n$ formulae for
$n > 0$, by counting the number of alternating blocks of relation quantifiers.

Given a structure $\mathfrak{A} = (A, R_1, \ldots, R_k)$, $L(\mathfrak{A})$ is
the sublanguage of $L^A$ that only allows relation constants for equality and
the relations $R_1, \ldots, R_k$. If $\mathcal{F}$ is a collection of formulae
of $L^A$, then $R \subseteq A^n$ is $\mathcal{F}$-definable over $\mathfrak{A}$
if there is a formula $\theta(\bar{x})$ of $L(\mathfrak{A})$ in $\mathcal{F}$
such that $\bar{x} = x_1, \ldots, x_n$ includes all the free variables of
$\theta(\bar{x})$ and

$$R(\bar{a}) \iff \theta(\bar{a}) \text{ is true} \quad \text{for } \bar{a} \in A^n.$$


Many constructions in ordinary recursion theory make use of some
coding apparatus, e.g. in Gödel numbering. To extend such constructions
to $\mathfrak{A}$ we shall need such apparatus to be definable on
$\mathfrak{A}$ in a suitable way.


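2.3.2. DEFINITION. A coding scheme for $A$ is a triple
$\mathcal{C} = (N, \le, \langle\ \rangle)$ where:

(i) $N \subseteq A$ and $\le$ is a binary relation on $N$ such that
$(N, \le) \cong (\omega, \le)$. We shall identify $N$ with
$\omega = \{0, 1, \ldots\}$.

(ii) $\langle\ \rangle : \bigcup_{n \in \omega} A^n \to A$ is an injective
function. Associated with $\mathcal{C}$ are the following.

(iii) Seq, the set of codes of finite sequences, is the range of
$\langle\ \rangle$.

(iv) $\mathrm{lh} : \mathrm{Seq} \to N$ is given by
$\mathrm{lh}(\langle x_1, \ldots, x_n \rangle) = n$.

(v) $q : \mathrm{Seq} \times N \to A$ is given by

$$q(\langle x_1, \ldots, x_n \rangle, i) = \begin{cases} x_i & \text{if } 1 \le i \le n, \\ 0 & \text{otherwise.} \end{cases}$$

(vi) $s : N \to N$ is given by $s(n) = n + 1$.

If $\mathcal{F}$ is a class of formulae we say that $\mathcal{C}$ is
$\mathcal{F}$-definable over $\mathfrak{A}$ if $N$, $\le$, Seq and the graphs
of $\mathrm{lh}$, $q$ and $s$ are $\mathcal{F}$-definable over $\mathfrak{A}$.

$\mathfrak{A}$ is $\mathcal{F}$-acceptable if there is a coding scheme
$\mathcal{C}$ over $A$ that is $\mathcal{F}$-definable over $\mathfrak{A}$. In
case $\mathcal{F}$ is the class of elementary formulae we just write acceptable
for $\mathcal{F}$-acceptable.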

Given a coding scheme $\mathcal{C}$ over $A$ and a structure $\mathfrak{A}$,
formulae $\theta$ of $L(\mathfrak{A})$ can be assigned "Gödel numbers"
$\ulcorner \theta \urcorner \in A$ in a standard way. We shall not go into the
details of this here but shall occasionally need to make use of certain facts
about such a Gödel numbering.

Given a structure $\mathfrak{A}$ let $T(\mathfrak{A})$ be the formal system, in
the first order language $L(\mathfrak{A})$, that has a standard system of
axioms and rules of inference for first order logic with equality and has the
diagram of $\mathfrak{A}$ as a set of non-logical axioms. So every true
sentence of $L(\mathfrak{A})$ that is atomic or the negation of an atomic
sentence is an axiom of $T(\mathfrak{A})$.

Our generalization of the class of r.e. relations on $\omega$ is the class of
$T(\mathfrak{A})$-representable relations on $A$. For this to be a good notion
we shall require that $\mathfrak{A}$ is existentially acceptable with an
existentially definable coding scheme $\mathcal{C}$. Let $<$ be the relation on
$A$ that is the strict ordering relation of the copy of the natural numbers on
$A$ given by $\mathcal{C}$. Let $\Sigma^0_1(\mathfrak{A})$ be the class of
relations on $A$ that are $\Sigma^0_1$ definable over $\mathfrak{A}$, where the
above $<$ relation is used in defining restricted formulae. As on $\omega$, we
can show that the finitary rule set inductively defining the theorems of
$T(\mathfrak{A})$ induces a $\Sigma^0_1(\mathfrak{A})$ finitary rule set on $A$.


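2.3.3. DEFINITION. A finitary rule set $\Phi$ on $A$ is
$\Sigma^0_1(\mathfrak{A})$ if the relation $R_\Phi$ is
$\Sigma^0_1(\mathfrak{A})$, where $R_\Phi$ is the set of pairs
$(\langle a_1, \ldots, a_n \rangle, b)$ such that
$\Phi : \{a_1, \ldots, a_n\} \to b$.


The following is proved exactly as Proposition 2.1.2.


2.3.4. PROPOSITION. If $\mathfrak{A}$ is existentially acceptable and $\Phi$ is
a $\Sigma^0_1(\mathfrak{A})$ finitary rule set, then $I(\Phi)$ is also
$\Sigma^0_1(\mathfrak{A})$.


2.3.5. COROLLARY. If $\mathfrak{A}$ is existentially acceptable, then
$\ulcorner T(\mathfrak{A}) \urcorner$ is $\Sigma^0_1(\mathfrak{A})$ and hence
every $T(\mathfrak{A})$-representable relation is $\Sigma^0_1(\mathfrak{A})$.


To prove the last part we need the fact that if $\theta(\bar{x})$ is a formula
and $f(\bar{a}) = \ulcorner \theta(\bar{a}) \urcorner$ for $\bar{a} \in A^n$,
then the graph of $f$ is in $\Sigma^0_1(\mathfrak{A})$. The converse to this
will be given in the next section.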

Let us now generalize the $\omega$-rule. The system $T_\infty(\mathfrak{A})$ is
obtained from $T(\mathfrak{A})$ by adding the following infinitary rule:

$A$-rule: From $\varphi(a)$ for $a \in A$ infer $\forall x\, \varphi(x)$.

Let us call an elementary formula
$\varphi(X_1, \ldots, X_m, x_1, \ldots, x_n)$ of $L(\mathfrak{A})$ universally
true if the $\Pi^1_1$ sentence

$$\forall X_1 \cdots \forall X_m\, \forall x_1 \cdots \forall x_n\, \varphi(X_1, \ldots, X_m, x_1, \ldots, x_n)$$

is true. It is easily seen that the axioms of $T_\infty(\mathfrak{A})$ are all
universally true and all the rules of inference of $T_\infty(\mathfrak{A})$
preserve universal truth. Hence we have:


2.3.6. PROPOSITION. $T_\infty(\mathfrak{A}) \vdash \varphi$ implies $\varphi$ is
universally true.


In case $\mathfrak{A}$ is countable a completeness theorem holds that gives the
converse to Proposition 2.3.6. It can be proved using the omitting types
theorem for countable first order languages (see CHANG and KEISLER [1973]) or
else it can be proved by a direct Henkin construction, as in GRILLIOT [1974].


2.3.7. THEOREM. For countable $\mathfrak{A}$,

$$T_\infty(\mathfrak{A}) \vdash \varphi \quad \text{iff} \quad \varphi \text{ is universally true.}$$


2.3.8. COROLLARY. For countable $\mathfrak{A}$ a relation on $A$ is
$T_\infty(\mathfrak{A})$-representable iff it is $\Pi^1_1(\mathfrak{A})$ (i.e.
$\Pi^1_1$-definable over $\mathfrak{A}$).


PROOF. By Theorem 2.3.7, $\theta(\bar{x})$ $T_\infty(\mathfrak{A})$-represents
$R$ iff $\forall X_1 \cdots \forall X_m\, \theta(\bar{x})$ defines $R$, where
$X_1, \ldots, X_m$ are the relation variables occurring in the elementary
formula $\theta(\bar{x})$. This result includes the special case of arithmetic
when $\mathfrak{A} = \mathfrak{N}$. □


As before, if we assume given a coding scheme $\mathcal{C}$ over
$\mathfrak{A}$, the rule set inductively defining the theorems of
$T_\infty(\mathfrak{A})$, when Gödel numbered, induces a rule set on $A$ that
has the following property when $\mathcal{C}$ is elementary over
$\mathfrak{A}$.


2.3.9. DEFINITION. A rule set $\Phi$ on $A$ is regular elementary over
$\mathfrak{A}$ if there are elementary relations $R$, $S$ such that $\Phi$ is
the set of rules $R_a \to b$ such that $S(a, b)$.


2.3.10. PROPOSITION. If $\mathfrak{A}$ is an acceptable structure, then there is
a regular elementary rule set $\Phi$ such that
$\ulcorner T_\infty(\mathfrak{A}) \urcorner = I(\Phi)$.


This result will be needed in the next section. 

Finally we consider the notion of truth for (elementary) sentences of
$L(\mathfrak{A})$. It is easily seen that every true sentence can be proved in
$T_\infty(\mathfrak{A})$ and hence we have the characterization: if $\varphi$
is a sentence of $L(\mathfrak{A})$, $\varphi$ is true iff
$T_\infty(\mathfrak{A}) \vdash \varphi$. Alternatively we may give the usual
inductive definition for truth that follows the way $\varphi$ is built up. This
can be expressed as follows. Let us call expressions of the form $+\varphi$ or
$-\varphi$, where $\varphi$ is a sentence of $L(\mathfrak{A})$, labelled
sentences. Now let $\Phi$ be the following set of rules on labelled sentences:

$\emptyset \to +\theta$ for each true atomic sentence $\theta$,
$\emptyset \to -\theta$ for each false atomic sentence $\theta$;
$\{+\varphi, +\psi\} \to +(\varphi \wedge \psi)$ for sentences $\varphi$, $\psi$,
$\{-\varphi\} \to -(\varphi \wedge \psi)$ for sentences $\varphi$, $\psi$,
$\{-\psi\} \to -(\varphi \wedge \psi)$ for sentences $\varphi$, $\psi$;
similar rules for $\neg$, $\vee$, $\to$;
$\{+\varphi(a) \mid a \in A\} \to +\forall x\, \varphi(x)$ for each sentence $\forall x\, \varphi(x)$,
$\{-\varphi(a)\} \to -\forall x\, \varphi(x)$ for $a \in A$;
and similar rules for $\exists x$.

Then if $\varphi$ is a sentence,

$\varphi$ is true iff $+\varphi \in I(\Phi)$,
$\varphi$ is false iff $-\varphi \in I(\Phi)$.
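
Over a finite structure this inductive definition becomes a terminating recursion. The Python sketch below is illustrative only: sentences are encoded as tuples, a quantified sentence carries a function from elements to instances, and the names `label`, `A` and `rel` are assumptions made for the example.

```python
def label(s, A, rel):
    """Return '+' if +s is generated by the rules and '-' if -s is."""
    kind = s[0]
    if kind == "atom":    # rules with no premisses, read off the structure
        return "+" if tuple(s[2:]) in rel[s[1]] else "-"
    if kind == "not":
        return "-" if label(s[1], A, rel) == "+" else "+"
    if kind == "and":     # {+p, +q} -> +(p and q); {-p} or {-q} -> -(p and q)
        both = label(s[1], A, rel) == "+" and label(s[2], A, rel) == "+"
        return "+" if both else "-"
    if kind == "forall":  # {+p(a) | a in A} -> +forall; {-p(a)} -> -forall
        holds = all(label(s[1](a), A, rel) == "+" for a in A)
        return "+" if holds else "-"
    if kind == "exists":
        holds = any(label(s[1](a), A, rel) == "+" for a in A)
        return "+" if holds else "-"
    raise ValueError(f"unknown sentence form: {kind!r}")


A = [0, 1, 2]
rel = {"lt": {(0, 1), (0, 2), (1, 2)}}

# For all x there is y with x < y: false, since nothing lies above 2.
s1 = ("forall", lambda x: ("exists", lambda y: ("atom", "lt", x, y)))
# There is x such that no y lies below x: true, witnessed by x = 0.
s2 = ("exists", lambda x: ("forall", lambda y: ("not", ("atom", "lt", y, x))))

label1 = label(s1, A, rel)
label2 = label(s2, A, rel)
```

For an infinite $A$ the rule for $+\forall x\, \varphi(x)$ has infinitely many premisses, so no such direct evaluation is available and the definition is genuinely infinitary.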


Note again that, when Gödel numbered using an elementary coding scheme, the
above rule set $\Phi$ induces a regular elementary rule set on $A$.


3. Classes of inductive definitions 


3.1. The general framework 

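Much recent work on inductive definitions falls under the following
general framework. Assume given an infinite set $A$. Let $\mathcal{C}$ be a
class of operators, each of the form
$\varphi : \mathrm{Pow}(A^n) \to \mathrm{Pow}(A^n)$ for some $n > 0$.


3.1.1. DEFINITION. (i) $R \subseteq A^n$ is $\mathcal{C}$-inductive if there is
$\varphi : \mathrm{Pow}(A^m) \to \mathrm{Pow}(A^m)$ in $\mathcal{C}$ with
$m \ge n$ and $\bar{b} \in A^{m-n}$ such that for $\bar{a} \in A^n$,

$$R(\bar{a}) \iff (\bar{b}, \bar{a}) \in \varphi^\infty.$$

(ii) $\mathrm{IND}(\mathcal{C})$ is the set of $\mathcal{C}$-inductive relations.
(iii) $|\mathcal{C}| = \sup\{|\varphi| \mid \varphi \in \mathcal{C}\}$.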
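
Why not define $\mathrm{IND}(\mathcal{C})$ to be
$\{\varphi^\infty \mid \varphi \in \mathcal{C}\}$? This turns out not to be
natural, as we will want $\mathrm{IND}(\mathcal{C})$ to have certain closure
properties, e.g. we generally want $\mathrm{IND}(\mathcal{C})$ to be closed
under intersections and there seem to be no reasonable conditions on
$\mathcal{C}$ that will ensure this for
$\{\varphi^\infty \mid \varphi \in \mathcal{C}\}$. We give a general result
below that gives closure properties of $\mathrm{IND}(\mathcal{C})$ given
suitable assumptions on $\mathcal{C}$.

Call $\tau : A^n \to A^m$ a section map if $m \ge n$ and for some
$\bar{b} \in A^{m-n}$,

$$\tau(\bar{a}) = (\bar{b}, \bar{a}) \quad \text{for all } \bar{a} \in A^n.$$

Then $\mathrm{IND}(\mathcal{C}) = \{\tau^{-1}\varphi^\infty \mid \varphi \in \mathcal{C} \ \&\ \tau \text{ is a section map}\}$.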


Section maps are used to code several inductive definitions into one. The 
following is the key to this. 


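3.1.2. LEMMA. Let $n_1, \ldots, n_k > 0$ and $m = \max(n_1, \ldots, n_k) + 1$.
Then there are section maps
$\tau_1 : A^{n_1} \to A^m, \ldots, \tau_k : A^{n_k} \to A^m$ that have pairwise
disjoint ranges.


PROOF. Choose pairwise distinct elements $c_1, \ldots, c_k \in A$ and define

$$\tau_i(\bar{a}) = (\underbrace{c_i, \ldots, c_i}_{m - n_i}, \bar{a}) \in A^m \quad \text{for } \bar{a} \in A^{n_i}. \qquad \Box$$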
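
The construction in the proof is easy to carry out on a small set. The Python sketch below is an illustration; `section_maps` is a hypothetical name, and tuples are used for the elements of $A^n$.

```python
from itertools import product


def section_maps(arities, constants):
    """Return (m, [tau_1, ..., tau_k]) with tau_i(a) = (c_i, ..., c_i, a)."""
    if len(set(constants)) != len(arities):
        raise ValueError("need one distinct constant per arity")
    m = max(arities) + 1

    def make(c, n):
        # Pad with m - n >= 1 copies of c, so the first coordinate is c.
        return lambda a: (c,) * (m - n) + tuple(a)

    return m, [make(c, n) for c, n in zip(constants, arities)]


A = ["c1", "c2", "d"]
m, (tau1, tau2) = section_maps([1, 2], ["c1", "c2"])
range1 = {tau1(a) for a in product(A, repeat=1)}
range2 = {tau2(a) for a in product(A, repeat=2)}
```

The ranges are disjoint because every tuple in the range of $\tau_i$ begins with $c_i$, which is why $m$ is taken one larger than the largest arity.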


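3.1.3. DEFINITION. An operator

$$\theta : \mathrm{Pow}(A^{n_1}) \times \cdots \times \mathrm{Pow}(A^{n_k}) \to \mathrm{Pow}(A^n)$$

is section codable in $\mathcal{C}$ if for section maps
$\tau_1 : A^{n_1} \to A^m, \ldots, \tau_k : A^{n_k} \to A^m$,
$\tau : A^n \to A^m$, we have $\varphi \in \mathcal{C}$, where
$\varphi : \mathrm{Pow}(A^m) \to \mathrm{Pow}(A^m)$ is given by

$$\varphi(S) = \tau\,\theta(\tau_1^{-1}S, \ldots, \tau_k^{-1}S) \quad \text{for } S \subseteq A^m.$$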


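3.1.4. PROPOSITION. If (i) $\mathcal{C}$ is closed under unions (i.e.
$\varphi, \psi \in \mathcal{C}$ implies $\varphi \cup \psi \in \mathcal{C}$,
where $(\varphi \cup \psi)(S) = \varphi(S) \cup \psi(S)$),

(ii) every operator in $\mathcal{C}$ is section codable in $\mathcal{C}$,
then $\mathrm{IND}(\mathcal{C})$ is closed under every operator
$\theta : \mathrm{Pow}(A^{n_1}) \times \cdots \times \mathrm{Pow}(A^{n_k})
\to \mathrm{Pow}(A^n)$ that is section codable in $\mathcal{C}$ and is
monotone in each argument.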


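PROOF. Let $\theta$ be as above, section codable in $\mathcal{C}$ and
monotone in each argument. Let $R_i = \sigma_i^{-1}\varphi_i^\infty$ where
$\sigma_i : A^{n_i} \to A^{m_i}$ is a section map and
$\varphi_i : \mathrm{Pow}(A^{m_i}) \to \mathrm{Pow}(A^{m_i})$ is in
$\mathcal{C}$ for $i = 1, \dots, k$. We wish to show that
$\theta(R_1, \dots, R_k) = \tau^{-1}\varphi^\infty$ for some section map
$\tau$ and some $\varphi \in \mathcal{C}$.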




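Let $m = \max(m_1, \dots, m_k, n) + 1$, and choose section maps
$\tau_1 : A^{m_1} \to A^m, \dots, \tau_k : A^{m_k} \to A^m$,
$\tau : A^n \to A^m$ with pairwise disjoint ranges. Let
$\varphi_i'(S) = \tau_i\varphi_i(\tau_i^{-1}S)$ for $i = 1, \dots, k$ and
$S \subseteq A^m$. Let
$\theta'(S) = \tau\theta((\tau_1\sigma_1)^{-1}S, \dots, (\tau_k\sigma_k)^{-1}S)$
for $S \subseteq A^m$. Finally let
$\varphi(S) = \varphi_1'(S) \cup \cdots \cup \varphi_k'(S) \cup \theta'(S)$
for $S \subseteq A^m$. By assumption (ii) each $\varphi_i' \in \mathcal{C}$.
As $\theta$ is section codable in $\mathcal{C}$, $\theta' \in \mathcal{C}$.
Hence by assumption (i) $\varphi \in \mathcal{C}$.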

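Now it is easy to see that $\varphi_i^\infty = \tau_i^{-1}\varphi^\infty$ so
that $R_i = (\tau_i\sigma_i)^{-1}\varphi^\infty$ for $i = 1, \dots, k$. Also,
$\tau^{-1}\varphi^\lambda = \theta((\tau_1\sigma_1)^{-1}\varphi^{<\lambda},
\dots, (\tau_k\sigma_k)^{-1}\varphi^{<\lambda})$. Hence as $\theta$ is
monotone,
\[
\begin{aligned}
\tau^{-1}\varphi^\infty &= \bigcup_\lambda \tau^{-1}\varphi^\lambda \\
 &= \bigcup_\lambda \theta((\tau_1\sigma_1)^{-1}\varphi^{<\lambda}, \dots,
    (\tau_k\sigma_k)^{-1}\varphi^{<\lambda}) \\
 &= \theta((\tau_1\sigma_1)^{-1}\varphi^\infty, \dots,
    (\tau_k\sigma_k)^{-1}\varphi^\infty) \\
 &= \theta(R_1, \dots, R_k). \qquad \square
\end{aligned}
\]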


Note. Monotonicity of $\theta$ is essential in the above theorem. In general
$\mathrm{IND}(\mathcal{C})$ will not be closed under complementation.


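3.1.5. EXAMPLE. The basic first order monotone operators are $\vee^n$,
$\wedge^n$, $\exists^n$ and $\forall^n$ for $n > 0$.
\[
\begin{aligned}
\vee^n(R, S) &= R \cup S && \text{for } R, S \subseteq A^n, \\
\wedge^n(R, S) &= R \cap S && \text{for } R, S \subseteq A^n, \\
\exists^n(R) &= \{\bar a \in A^n \mid \exists x\, R(x, \bar a)\} && \text{for } R \subseteq A^{n+1}, \\
\forall^n(R) &= \{\bar a \in A^n \mid \forall x\, R(x, \bar a)\} && \text{for } R \subseteq A^{n+1}.
\end{aligned}
\]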
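
For readers who prefer to see these four operators as set manipulations, here is a small sketch. It assumes relations are Python sets of tuples over a finite stand-in for $A$; the function names are hypothetical.

```python
def or_n(R, S):
    """Union of two n-ary relations."""
    return R | S

def and_n(R, S):
    """Intersection of two n-ary relations."""
    return R & S

def exists_n(R):
    """{a : there is x with R(x, a)}: drop the first coordinate."""
    return {t[1:] for t in R}

def forall_n(R, A):
    """{a : R(x, a) holds for every x in A}."""
    return {t[1:] for t in R if all((x,) + t[1:] in R for x in A)}

A = {0, 1}                        # finite stand-in for A (assumption)
R = {(0, 0), (1, 0), (0, 1)}      # a binary relation, read as R(x, a)
R_bigger = R | {(1, 1)}
```

Enlarging `R` to `R_bigger` can only enlarge the outputs, which is the monotonicity that Proposition 3.1.4 relies on.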


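Usually the class $\mathcal{C}$ of operators is specified by definability
conditions.

Let $\varphi(X, \bar x)$ be a formula of $L^*$ having free at most the
$n$-ary relation variable $X$ and the individual variables
$\bar x = x_1, \dots, x_n$. We say that $\varphi(X, \bar x)$ defines the
operator $\varphi : \mathrm{Pow}(A^n) \to \mathrm{Pow}(A^n)$ if
\[ \varphi(S) = \{\bar a \in A^n \mid \varphi(S, \bar a) \text{ is true}\}
   \quad\text{for } S \subseteq A^n. \]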


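Given a class $\mathcal{F}$ of formulae of $L^*$ and a structure
$\mathfrak{A} = (A, R_1, \dots, R_l)$ let $\mathcal{F}(\mathfrak{A})$ denote
the class of operators definable by a formula of $\mathcal{F}$ in
$L(\mathfrak{A})$. Let $\text{mon-}\mathcal{F}(\mathfrak{A})$ denote the
subclass of monotone operators in $\mathcal{F}(\mathfrak{A})$. Finally let
$\text{pos-}\mathcal{F}(\mathfrak{A})$ denote the class of operators
definable by a formula $\varphi(X, \bar x)$ of $\mathcal{F}$ in
$L(\mathfrak{A})$ in which $X$ occurs only positively, i.e.
$\varphi(X, \bar x)$ is built up from formulae not involving $X$ and atomic
formulae involving $X$ using only the positive connectives $\vee$, $\wedge$
and the quantifiers. Operators in $\text{pos-}\mathcal{F}(\mathfrak{A})$ are
automatically monotone. So we have
\[ \text{pos-}\mathcal{F}(\mathfrak{A}) \subseteq
   \text{mon-}\mathcal{F}(\mathfrak{A}) \subseteq \mathcal{F}(\mathfrak{A}) \]
and hence
\[ \mathrm{IND}(\text{pos-}\mathcal{F}(\mathfrak{A})) \subseteq
   \mathrm{IND}(\text{mon-}\mathcal{F}(\mathfrak{A})) \subseteq
   \mathrm{IND}(\mathcal{F}(\mathfrak{A})) \]
and $|\text{pos-}\mathcal{F}(\mathfrak{A})| \le
|\text{mon-}\mathcal{F}(\mathfrak{A})| \le |\mathcal{F}(\mathfrak{A})|$.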
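
The remark that positive operators are automatically monotone can be checked by brute force on a small example. The sketch below assumes a four-element stand-in for $A$ and a hypothetical edge relation `E`; `positive_op` comes from a formula in which $X$ occurs only positively, `negative_op` from one in which it occurs negatively.

```python
from itertools import combinations

def subsets(A):
    """All subsets of A as frozensets."""
    items = list(A)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_monotone(op, A):
    """Check S <= T  implies  op(S) <= op(T) for all subsets S, T of A."""
    P = subsets(A)
    return all(op(S) <= op(T) for S in P for T in P if S <= T)

A = range(4)                      # finite stand-in for A (assumption)
E = {(0, 1), (1, 2), (2, 3)}      # hypothetical relation of the structure

def positive_op(X):
    # phi(X, x):  x = 0  or  exists y (X(y) and E(y, x))
    return frozenset(x for x in A
                     if x == 0 or any(y in X and (y, x) in E for y in A))

def negative_op(X):
    # phi(X, x):  not X(x)
    return frozenset(x for x in A if x not in X)
```

The check is exhaustive over all pairs of subsets, so it is only feasible for very small sets; it is meant as a sanity check of the definition, not as a proof.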


3.2. Positive existential induction 

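In this subsection we look at perhaps the simplest example of the general
framework, i.e. we consider the class of operators
$\text{pos-}\mathcal{F}(\mathfrak{A})$, where $\mathcal{F}$ is the class of
existential formulae. We shall write
$\mathrm{IND}(\exists\text{-}\mathfrak{A})$ and $|\exists\text{-}\mathfrak{A}|$
for $\mathrm{IND}(\text{pos-}\mathcal{F}(\mathfrak{A}))$ and
$|\text{pos-}\mathcal{F}(\mathfrak{A})|$, when $\mathcal{F}$ is this class. We
will see that for existentially acceptable $\mathfrak{A}$ the class
$\mathrm{IND}(\exists\text{-}\mathfrak{A})$ coincides with the
$T(\mathfrak{A})$-representable relations and gives a good generalization of
the r.e. relations on $\omega$.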


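3.2.1. PROPOSITION. If $\mathfrak{A}$ is infinite, then
\[ |\exists\text{-}\mathfrak{A}| = \omega. \]

PROOF. $|\exists\text{-}\mathfrak{A}| \ge \omega$, as $|\varphi_n| = n$ for
each $n \in \omega$ where
\[ \varphi_n(X) = \Bigl\{x \in A \Bigm| \bigvee_{i<n}
   \Bigl[\bigwedge_{j<i} X(a_j) \wedge x = a_i\Bigr]\Bigr\}
   \quad\text{for } X \subseteq A, \]
where $a_0, \dots, a_{n-1}$ are pairwise distinct elements of $A$. Clearly
each $\varphi_n$ is positive existential,
$\varphi_n^i = \{a_j \mid j \le i\}$ for $i < n$ and hence
$I(\varphi_n) = \{a_j \mid j < n\}$.
To show that $|\exists\text{-}\mathfrak{A}| \le \omega$, by Proposition 1.3.4
it suffices to show that each positive existential operator $\varphi$ is
$\omega$-based. For this it suffices to show that, for each existential
formula $\theta(X)$ positive in the relation variable $X$ and containing no
other free variables, $\theta(R)$ is true implies $\theta(S)$ is true for
some finite $S \subseteq R$. This is easily proved by induction on the way
such formulae $\theta(X)$ are built up.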
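
To make the lower bound concrete, the sketch below builds one operator with the stages described in the proof (stage $i$ is $\{a_j \mid j \le i\}$) and counts how many stages add something new. It takes $a_i = i$ as an illustrative choice, and the helper names are hypothetical.

```python
def make_phi_n(n):
    """phi_n(X) = {a_i : i < n and a_j in X for all j < i}, with a_i = i."""
    def phi(X):
        return frozenset(i for i in range(n) if all(j in X for j in range(i)))
    return phi

def productive_stages(phi):
    """Number of iteration steps of S -> S | phi(S) that add new elements."""
    stage, steps = frozenset(), 0
    while True:
        nxt = stage | phi(stage)
        if nxt == stage:
            return steps, stage
        stage, steps = nxt, steps + 1

lengths = [productive_stages(make_phi_n(n))[0] for n in range(6)]
```

Each step admits exactly one new element, so the iteration for `make_phi_n(n)` runs for exactly `n` productive steps before closing off.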
The next result gives closure properties for
$\mathrm{IND}(\exists\text{-}\mathfrak{A})$.


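3.2.2. PROPOSITION. (i) If
$\theta : \mathrm{Pow}(A^{n_1}) \times \cdots \times \mathrm{Pow}(A^{n_k})
\to \mathrm{Pow}(A^n)$ is positive existential over $\mathfrak{A}$ (or
equivalently is section codable in the class of positive existential
operators), then $\mathrm{IND}(\exists\text{-}\mathfrak{A})$ is closed under
$\theta$. Hence the relations $=, R_1, \dots, R_l$ and their complements are
in $\mathrm{IND}(\exists\text{-}\mathfrak{A})$, as the appropriate constant
operators are positive existential over $\mathfrak{A}$. Also
$\mathrm{IND}(\exists\text{-}\mathfrak{A})$ is closed under $\vee^n$,
$\wedge^n$, $\exists^n$ for $n > 0$ as these are positive existential over
$\mathfrak{A}$.

(ii) If $N \subseteq A$, $<$ is a relation on $A$, and $s : N \to N$ such
that $(N, <, s) \cong (\omega, <, s)$ and $N$, $<$ and the graph of $s$ are
existentially definable over $\mathfrak{A}$, then
$\mathrm{IND}(\exists\text{-}\mathfrak{A})$ is $\forall^n_<$-closed for
$n > 0$ where
\[ \forall^n_<(R) = \{(a, \bar b) \in A^{n+1} \mid a \in N\ \&\
   \forall x\,(x < a \to R(x, \bar b))\} \quad\text{for } R \subseteq A^{n+1}. \]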
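PROOF. (i) This is an application of Proposition 3.1.4.

(ii) Let $R = \tau^{-1}\varphi^\infty$ where $\tau : A^{n+1} \to A^m$ is a
section map and $\varphi : \mathrm{Pow}(A^m) \to \mathrm{Pow}(A^m)$ is
positive existential over $\mathfrak{A}$. We show that $\forall^n_<(R)$ is in
$\mathrm{IND}(\exists\text{-}\mathfrak{A})$. First identify $N$ with
$\omega = \{0, 1, \dots\}$. Let $\tau_i : A^m \to A^{m+1}$ be the section maps
$\tau_i(\bar x) = (i, \bar x)$ for $\bar x \in A^m$ where $i = 0, 1$. Let
$\psi(X) = \tau_0\varphi(\tau_0^{-1}X) \cup
(\tau_1\tau)\theta((\tau_0\tau)^{-1}X, (\tau_1\tau)^{-1}X)$ for
$X \subseteq A^{m+1}$, where
\[ \theta(Y, Z) = \{(x, \bar x) \in A^{n+1} \mid x = 0 \vee
   \exists y\,(x = s(y)\ \&\ Y(y, \bar x)\ \&\ Z(y, \bar x))\} \]
for $Y, Z \subseteq A^{n+1}$.

Then $\psi$ is positive existential over $\mathfrak{A}$. Clearly
$\tau_0^{-1}\psi^\infty = \varphi^\infty$. So
$R = (\tau_0\tau)^{-1}\psi^\infty$. Hence if
$S = (\tau_1\tau)^{-1}\psi^\infty$, then
$S(x, \bar x) \iff x = 0$ or $\exists y\,(x = s(y)\ \&\ R(y, \bar x)\ \&\
S(y, \bar x))$. This can only hold if $S = \forall^n_<(R)$. Hence
$\forall^n_<(R) = (\tau_1\tau)^{-1}\psi^\infty$ is in
$\mathrm{IND}(\exists\text{-}\mathfrak{A})$. □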


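3.2.3. PROPOSITION. If $\varphi : \mathrm{Pow}(A^n) \to \mathrm{Pow}(A^n)$ is
positive existential over $\mathfrak{A}$, then $I(\varphi)$ is
$T(\mathfrak{A})$-represented by $\psi(X) \to X(\bar x)$, where $\psi(X)$ is
$\forall \bar y\,(\varphi(X, \bar y) \to X(\bar y))$ where
$\varphi(X, \bar x)$ is an existential formula of $L(\mathfrak{A})$ positive
in $X$ that defines the operator.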


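PROOF. Let $T$ be the relation represented by $\psi(X) \to X(\bar x)$. First
note that if $\bar a \in T$, then by Proposition 2.3.6,
$\psi(X) \to X(\bar a)$ is universally valid, i.e.
$\bar a \in \bigcap\{S \subseteq A^n \mid \varphi(S) \subseteq S\} =
I(\varphi)$. Thus $T \subseteq I(\varphi)$. To show
$I(\varphi) \subseteq T$ it suffices to show that
$\varphi(T) \subseteq T$. We need the following.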


Claim. Let $\theta(X)$ be an existential formula of $L(\mathfrak{A})$
containing $X$ only positively and containing no other variables free. Then
\[ \theta(T) \text{ is true} \quad\text{implies}\quad
   T(\mathfrak{A}) \vdash \psi(X) \to \theta(X). \]

This claim is proved by an easy induction on the number of logical
symbols in $\theta(X)$.
Now if $\bar a \in \varphi(T)$, then $\varphi(T, \bar a)$ is true, and hence
by the claim $T(\mathfrak{A}) \vdash \psi(X) \to \varphi(X, \bar a)$.
Recalling that $\psi(X)$ is
$\forall \bar y\,(\varphi(X, \bar y) \to X(\bar y))$ it follows that
$T(\mathfrak{A}) \vdash \psi(X) \to X(\bar a)$ and hence $\bar a \in T$. Thus
$\varphi(T) \subseteq T$ as required. □


3.2.4. THEOREM. Let $\mathfrak{A}$ be an existentially acceptable structure.
Then the following are equivalent for a relation $R$ on $A$.
(i) $R \in \mathrm{IND}(\exists\text{-}\mathfrak{A})$,
(ii) $R$ is $T(\mathfrak{A})$-representable,
(iii) $R$ is $\Sigma^0_1(\mathfrak{A})$-definable.


Note. In (iii), $\Sigma^0_1(\mathfrak{A})$ is defined relative to the copy
$(N, <)$ of $(\omega, <)$ where
$\mathcal{C} = (N, <, \langle\ \rangle)$ is an existentially definable coding
scheme over $\mathfrak{A}$. It follows from the theorem that
$\Sigma^0_1(\mathfrak{A})$ is essentially independent of the coding scheme
$\mathcal{C}$ used.


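PROOF. (i) $\to$ (ii). Let $R = \tau^{-1}\varphi^\infty$ where $\tau$ is a
section map $\tau(\bar a) = (\bar b, \bar a)$ for $\bar a \in A^n$, and
$\varphi$ is a positive existential operator over $\mathfrak{A}$. Then by
Proposition 3.2.3, $\varphi^\infty$ is $T(\mathfrak{A})$-representable by a
formula $\theta(\bar y)$ say. Then
\[ R(\bar a) \iff \tau(\bar a) \in \varphi^\infty \iff
   T(\mathfrak{A}) \vdash \theta(\bar b, \bar a)
   \quad\text{for } \bar a \in A^n. \]

Hence $R$ is $T(\mathfrak{A})$-represented by $\theta(\bar b, \bar x)$.
(ii) $\to$ (iii). This is just Corollary 2.3.5.
(iii) $\to$ (i). This follows from Proposition 3.2.2. □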


The following property holds for the r.e. relations on w and is one of the 
basic structural properties used in recursion theory and its generalizations. 


3.2.5. DEFINITION. A class $\Gamma$ of relations on $A$ has the
parametrization property if for each $n > 0$ there is a relation
$U \subseteq A^{n+1}$ such that $U \in \Gamma$ and for each
$R \subseteq A^n$ that is in $\Gamma$ there is an $a \in A$ such that
\[ R = U_a = \{\bar a \in A^n \mid U(a, \bar a)\}. \]


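3.2.6. THEOREM. If $\mathfrak{A}$ is existentially acceptable, then
$\mathrm{IND}(\exists\text{-}\mathfrak{A})$ has the parametrization property.


PROOF. By Corollary 2.3.5 and Theorem 3.2.4 the set
$\ulcorner T(\mathfrak{A}) \urcorner$ is in
$\mathrm{IND}(\exists\text{-}\mathfrak{A})$. We also need the following fact
about coding the syntax of $\ulcorner T(\mathfrak{A}) \urcorner$.
The relation Sub is in $\Sigma^0_1(\mathfrak{A})$ and hence in
$\mathrm{IND}(\exists\text{-}\mathfrak{A})$, where for $a, b, c \in A$,
$\mathrm{Sub}(a, b, c) \iff$ there is an elementary formula
$\theta(\bar x)$ in $L(\mathfrak{A})$ with
$a = \ulcorner \theta(\bar x) \urcorner$ and there is $\bar b$ such that
$b = \langle \bar b \rangle$ and $c = \ulcorner \theta(\bar b) \urcorner$.

Now, given $n > 0$ let $U \subseteq A^{n+1}$ be given by
\[ U(a, \bar a) \iff \exists x\,[\mathrm{Sub}(a, \langle \bar a \rangle, x)
   \ \&\ x \in \ulcorner T(\mathfrak{A}) \urcorner]. \]

Using the closure properties of $\mathrm{IND}(\exists\text{-}\mathfrak{A})$
we see that $U \in \mathrm{IND}(\exists\text{-}\mathfrak{A})$. Now if
$R \subseteq A^n$ is in $\mathrm{IND}(\exists\text{-}\mathfrak{A})$ then by
3.2.4, it is $T(\mathfrak{A})$-represented by a formula $\theta(\bar x)$ say.
Let $a = \ulcorner \theta(\bar x) \urcorner$. Then for $\bar a \in A^n$
\[ R(\bar a) \iff \ulcorner \theta(\bar a) \urcorner \in
   \ulcorner T(\mathfrak{A}) \urcorner \iff U(a, \bar a). \]

Hence $R = U_a$. □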


3.3. Positive elementary induction 

In this subsection we will outline some of the properties of positive
elementary induction. The theory has been presented in MOSCHOVAKIS
[1974a] and readers should look there for a detailed development of the
subject.

If $\mathcal{F}$ is the class of elementary formulae of $L^*$, then we shall
write $\mathrm{IND}(\mathfrak{A})$ for
$\mathrm{IND}(\text{pos-}\mathcal{F}(\mathfrak{A}))$ and $\kappa(\mathfrak{A})$
for $|\text{pos-}\mathcal{F}(\mathfrak{A})|$.


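3.3.1. PROPOSITION. $\mathrm{IND}(\mathfrak{A})$ is positive elementary
closed over $\mathfrak{A}$ (i.e. if $\theta$ is section codable in the class
of positive elementary operations, then $\mathrm{IND}(\mathfrak{A})$ is closed
under $\theta$). Hence the relations $=, R_1, \dots, R_l$ and their
complements are in $\mathrm{IND}(\mathfrak{A})$ and
$\mathrm{IND}(\mathfrak{A})$ is closed under $\vee^n$, $\wedge^n$,
$\exists^n$ and $\forall^n$ for $n > 0$.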


This is an application of Proposition 3.1.4. 


3.3.2. PROPOSITION. Let $\Phi$ be a rule set on $A$ that is regular
elementary over $\mathfrak{A}$. Let
$\varphi : \mathrm{Pow}(A) \to \mathrm{Pow}(A)$ be the associated monotone
operator. Then $\varphi$ is positive elementary over $\mathfrak{A}$ and hence
$I(\Phi) = I(\varphi) \in \mathrm{IND}(\mathfrak{A})$.


PROOF. Let $\Phi$ be the set of rules $R_a \to b$ such that $S(a, b)$. Then
for $X \subseteq A$,
\[ \varphi(X) = \{b \in A \mid \exists y\,[S(y, b) \wedge
   \forall x\,(R(y, x) \to X(x))]\}. \]

If $R$ and $S$ are replaced by their elementary definitions then we obtain an
elementary definition of $\varphi$. □

3.3.3. COROLLARY. If $\mathfrak{A}$ is acceptable, then
$\ulcorner T_\infty(\mathfrak{A}) \urcorner \in \mathrm{IND}(\mathfrak{A})$
and hence every $T_\infty(\mathfrak{A})$-representable relation is in
$\mathrm{IND}(\mathfrak{A})$.


PROOF. The first part follows from Propositions 3.3.2 and 2.3.10. The last
part is proved as in the last part of Corollary 2.3.5 using some of the
closure properties of $\mathrm{IND}(\mathfrak{A})$ given in 3.3.1. □


3.3.4. PROPOSITION. If $\varphi : \mathrm{Pow}(A^n) \to \mathrm{Pow}(A^n)$ is
positive elementary over $\mathfrak{A}$, then $I(\varphi)$ is
$T_\infty(\mathfrak{A})$-representable.


PROOF. This result is proved as in the proof of Proposition 3.2.3. The claim
used there must be modified by replacing $T(\mathfrak{A})$ by
$T_\infty(\mathfrak{A})$ and allowing $\theta(X)$ to be any elementary
formula of $L(\mathfrak{A})$. The $\forall$-rule of $T_\infty(\mathfrak{A})$
is just what is needed in order to take care of the proof for the case that
$\theta(X)$ is a universal quantification. □


As a consequence of the previous two results we have: 


3.3.5. THEOREM. For acceptable $\mathfrak{A}$ and relation $R$ on $A$,
$R \in \mathrm{IND}(\mathfrak{A})$ iff $R$ is
$T_\infty(\mathfrak{A})$-representable.


As in the proof of Theorem 3.2.6, we have: 


3.3.6. COROLLARY. If $\mathfrak{A}$ is acceptable, then
$\mathrm{IND}(\mathfrak{A})$ has the parametrization property.


The following result is called the Abstract Kleene Theorem in
MOSCHOVAKIS [1974a]. The proof given there uses a quite different method
from the one we use.


3.3.7. THEOREM. If $\mathfrak{A}$ is a countable acceptable structure and $R$
is a relation on $A$, then $R \in \mathrm{IND}(\mathfrak{A})$ iff $R$ is
$\Pi^1_1(\mathfrak{A})$.


This theorem is an immediate consequence of Corollary 2.3.8 and
Theorem 3.3.5.


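3.3.8. DEFINITION. A norm on a set $R$ is a map
$\sigma : R \twoheadrightarrow \lambda$ of $R$ onto an ordinal $\lambda$.
$\lambda$ is called the length of $\sigma$.


If $R \subseteq A^n$ then associated with $\sigma$ are the $2n$-ary relations
$<^*_\sigma$ and $\le^*_\sigma$ given by
\[
\begin{aligned}
\bar a <^*_\sigma \bar b &\iff R(\bar a)\ \&\ (R(\bar b) \to \sigma(\bar a) < \sigma(\bar b)), \\
\bar a \le^*_\sigma \bar b &\iff R(\bar a)\ \&\ (R(\bar b) \to \sigma(\bar a) \le \sigma(\bar b)),
\end{aligned}
\]
for $\bar a, \bar b \in A^n$.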


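3.3.9. EXAMPLE. If $\varphi : \mathrm{Pow}(A^n) \to \mathrm{Pow}(A^n)$, then
$|\ |_\varphi : \varphi^\infty \to |\varphi|$ is a norm on $\varphi^\infty$.
We write $<^*_\varphi$ and $\le^*_\varphi$ rather than $<^*_\sigma$ and
$\le^*_\sigma$ when $\sigma = |\ |_\varphi$.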
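
The stage norm and its two comparison relations can be computed directly when everything is finite. The following sketch assumes a finite stand-in for $A$ and a hypothetical successor-style operator; `stage_norm`, `leq_star` and `lt_star` are illustrative names, with the two comparisons written to follow the shape of Definition 3.3.8.

```python
def stage_norm(phi):
    """Map each element of the inductively defined set to the first stage
    at which it appears (stages are numbered from 0)."""
    norm, stage, k = {}, frozenset(), 0
    while True:
        new = phi(stage) - stage
        if not new:
            return norm
        for a in new:
            norm[a] = k
        stage, k = stage | new, k + 1

def leq_star(norm, a, b):
    # a <=* b  iff  R(a) and (R(b) -> |a| <= |b|)
    return a in norm and (b not in norm or norm[a] <= norm[b])

def lt_star(norm, a, b):
    # a <* b   iff  R(a) and (R(b) -> |a| < |b|)
    return a in norm and (b not in norm or norm[a] < norm[b])

A = range(6)                      # finite stand-in for A (assumption)

def phi(X):
    """0 is an axiom; x + 1 follows from x, but only up to 2."""
    return frozenset({0}) | frozenset(x + 1 for x in X if x + 1 <= 2)

norm = stage_norm(phi)            # elements 3, 4, 5 never get a norm
```

Elements outside the inductively defined set sit above everything inside it, so `lt_star(norm, 2, 5)` holds while `leq_star(norm, 5, 0)` does not.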


3.3.10. DEFINITION. Let $\Gamma$ be a class of relations on $A$. If $R$ is a
relation on $A$, then a norm $\sigma$ on $R$ is a $\Gamma$-norm if
$<^*_\sigma$ and $\le^*_\sigma$ are in $\Gamma$. $\Gamma$ is normed if every
relation in $\Gamma$ has a $\Gamma$-norm. For normed $\Gamma$ let
$o(\Gamma)$ be the supremum of the lengths of the $\Gamma$-norms.


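3.3.11. PROPOSITION. $\mathrm{IND}(\mathfrak{A})$ is normed.


PROOF. Let $R = \tau^{-1}\varphi^\infty$ where $\tau$ is a section map and
$\varphi$ is positive elementary over $\mathfrak{A}$. Let
$\sigma_0(\bar a) = |\tau(\bar a)|_\varphi$ for $\bar a \in R$.
$\sigma_0 : R \to |\varphi|$ may not be a norm as its range may not be an
initial segment of ordinals. But there is a unique order preserving function
$f$ mapping the range of $\sigma_0$ onto an ordinal $\lambda$. Then if
$\sigma(\bar a) = f(\sigma_0(\bar a))$ for $\bar a \in R$,
$\sigma : R \twoheadrightarrow \lambda$ is a norm on $R$ of length
$\lambda \le |\varphi|$.

Note that for $\bar a, \bar b \in A^n$, $\bar a <^*_\sigma \bar b$ iff
$\tau(\bar a) <^*_\varphi \tau(\bar b)$ and similarly for $\le^*_\sigma$.
Hence to prove the proposition it suffices to show that $<^*_\varphi$ and
$\le^*_\varphi$ are in $\mathrm{IND}(\mathfrak{A})$. This follows from: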


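3.3.12. FIRST STAGE COMPARISON THEOREM. Let
$\varphi : \mathrm{Pow}(A^n) \to \mathrm{Pow}(A^n)$ be a positive elementary
operator. Then there are positive elementary operators
$\varphi_<, \varphi_\le : \mathrm{Pow}(A^{2n}) \to \mathrm{Pow}(A^{2n})$
such that
\[ <^*_\varphi\ = I(\varphi_<) \quad\text{and}\quad
   \le^*_\varphi\ = I(\varphi_\le). \]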


Moreover, p-(X) = {(a, b)€ A?" | (b, 2) € p-(X)} for X CA”™. 


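PROOF. $\varphi_<$ and $\varphi_\le$ are defined as follows. Let
\[ \theta(X) = \{(\bar a, \bar b) \in A^{2n} \mid \bar a \in
   \varphi'(\{\bar x \in A^n \mid (\bar b, \bar x) \notin X\})\}
   \quad\text{for } X \subseteq A^{2n}, \]
where $\varphi'(Y) = \varphi(Y) \cup Y$ for $Y \subseteq A^n$. Then let
$\varphi_\le(X) = \theta(\theta(X))$ for $X \subseteq A^{2n}$ and define
$\varphi_<$ as required by the theorem. Then observe that $\varphi'$ is
positive elementary and hence that $\varphi_\le$ and $\varphi_<$ are also.
The proof that these operators inductively define $\le^*_\varphi$ and
$<^*_\varphi$ follows easily from the lemma below. □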


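3.3.13. LEMMA. For all ordinals $\lambda$,
\[
\begin{aligned}
\bar a <^*_\varphi \bar b\ \&\ \bar a \in \varphi^\lambda &\iff (\bar a, \bar b) \in \varphi_<^\lambda, \\
\bar a \le^*_\varphi \bar b\ \&\ \bar a \in \varphi^\lambda &\iff (\bar a, \bar b) \in \varphi_\le^\lambda.
\end{aligned}
\]
PROOF. This lemma is proved by induction on $\lambda$. □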


3.3.14. DEFINITION. A relation $R$ on $A$ such that both $R$ and $\neg R$ are
in $\mathrm{IND}(\mathfrak{A})$ is called hyperelementary over
$\mathfrak{A}$. We write $\mathrm{HYP}(\mathfrak{A})$ for the class of such
relations.

This class generalizes the class of hyperarithmetical relations on $\omega$.


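3.3.15. COROLLARY. If $\varphi : \mathrm{Pow}(A^n) \to \mathrm{Pow}(A^n)$ is
positive elementary over $\mathfrak{A}$, then
$\varphi^\lambda \in \mathrm{HYP}(\mathfrak{A})$ for all
$\lambda < |\varphi|$.


PROOF. If $\lambda < |\varphi|$, then $\lambda = |\bar a|_\varphi$ for some
$\bar a \in I(\varphi)$. So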


pe ={x ECA" |x Sta} ={x EA" |(x,a)€ gt ={¥ EA" |(4 2) E G2}, 


Tp ={K ECA" G<EX}={KEA"|(4,X)€E Ge}. 


So if r(x) = (@, x) for * E A”, then 
yg =7 'S= and ag*=7'~ee. 


Hence $\varphi^\lambda \in \mathrm{HYP}(\mathfrak{A})$. □


The basic properties of the $\Pi^1_1$ relations on $\omega$, and more
generally of $\mathrm{IND}(\mathfrak{A})$ for $\mathfrak{A}$ an acceptable
structure, are incorporated in the following important definition first
introduced in MOSCHOVAKIS [1974a].


3.3.16. DEFINITION. A class $\Gamma$ of relations on a set $A$ is a Spector
class over $\mathfrak{A}$ if

(i) $\Gamma$ is positive elementary closed over $\mathfrak{A}$ (see
Proposition 3.3.1),

(ii) $\Gamma$ contains a coding scheme $\mathcal{C}$ on $A$ (i.e. $N$,
$\le$, Seq, the graphs of lh, $q$ and $s$ and the complements of all these
are in $\Gamma$ where these are defined in Definition 2.3.2),

(iii) $\Gamma$ has the parametrization property (see Definition 3.2.5),

(iv) $\Gamma$ is normed.


3.3.17. THEOREM. If $\mathfrak{A}$ is an acceptable structure, then
$\mathrm{IND}(\mathfrak{A})$ is the smallest Spector class over
$\mathfrak{A}$.

This result also holds under the slightly weaker hypothesis that
$\mathfrak{A}$ is almost acceptable, where this means that
$\mathrm{IND}(\mathfrak{A})$ contains a coding scheme on $A$. The fact that
$\mathrm{IND}(\mathfrak{A})$ is a Spector class over $\mathfrak{A}$ is
obtained by combining Proposition 3.3.1 and 3.3.12, and Corollary 3.3.6. To
see that it is the smallest one requires the following result about Spector
classes.


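3.3.18. THEOREM. Let $\Gamma$ be a Spector class over $\mathfrak{A}$. Let
$\varphi : \mathrm{Pow}(A^n) \to \mathrm{Pow}(A^n)$ be positive elementary
over $\mathfrak{A}$. Then $I(\varphi) \in \Gamma$.


PROOF. We sketch a proof of this. The details may be found in 9A of
MOSCHOVAKIS [1974a]. By the parametrization property of $\Gamma$ choose
$U \subseteq A^{n+2}$ in $\Gamma$ to parametrize the $(n+1)$-ary relations in
$\Gamma$. Let $\pi : U \twoheadrightarrow \kappa$ be a $\Gamma$-norm on $U$.
Let
\[ Q(t, \bar a) \iff \bar a \in \varphi(\{\bar x \in A^n \mid
   (t, t, \bar x) <^*_\pi (t, t, \bar a)\}). \]
Then as $\Gamma$ is positive elementary closed $Q \in \Gamma$ and hence
$Q = U_a$ for some $a \in A$.

Let $P(\bar a) \iff Q(a, \bar a)$ for $\bar a \in A^n$. Let
$\sigma_0 : P \to \kappa$ be given by $\sigma_0(\bar a) = \pi(a, a, \bar a)$
for $\bar a \in P$. $\sigma_0$ may not be surjective, but by composing with
an order preserving mapping of the range of $\sigma_0$ onto an initial
segment of ordinals we obtain a norm $\sigma$ on $P$. Then for
$\bar a \in A^n$,
\[ P(\bar a) \iff \bar a \in \varphi(\{\bar x \in A^n \mid
   \bar x <^*_\sigma \bar a\}). \]
But it may easily be shown that any $P$ with a norm satisfying this
equivalence must be identical with $I(\varphi)$. Hence as $P \in \Gamma$,
$I(\varphi) \in \Gamma$. □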


The ordinal $\kappa(\mathfrak{A})$ may be characterised in terms of
$\mathrm{IND}(\mathfrak{A})$ as the sup of the lengths of the positive
elementary inductive norms over $\mathfrak{A}$, i.e.:


3.3.19. THEOREM. $\kappa(\mathfrak{A}) = o(\mathrm{IND}(\mathfrak{A}))$.


The next result generalizes the Spector-Gandy theorem for the $\Pi^1_1$
relations on $\omega$.


3.3.20. ABSTRACT SPECTOR-GANDY THEOREM. Let $\mathfrak{A}$ be an acceptable
structure. If $R \subseteq A^n$, then $R \in \mathrm{IND}(\mathfrak{A})$ iff
there is an elementary second order relation $\mathcal{Q}$ such that for
$\bar a \in A^n$
\[ R(\bar a) \iff \exists X \in \mathrm{HYP}(\mathfrak{A})\
   \mathcal{Q}(X, \bar a). \]

This result was first stated and proved in MOSCHOVAKIS [1974a]. A
simplification of that proof was given in ACZEL [1972]. It used a new proof
of Moschovakis's second stage comparison theorem (7C.1 of MOSCHOVAKIS
[1974a]). The following consequence is the main fact needed.


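3.3.21. PROPOSITION. Let $\varphi : \mathrm{Pow}(A^n) \to \mathrm{Pow}(A^n)$
be positive elementary over $\mathfrak{A}$. Then there is a second order
relation $\mathcal{Q}$ that is elementary over $\mathfrak{A}$ such that if
$\bar a \in I(\varphi)$ or $\bar b \in I(\varphi)$, then the following are
equivalent
(i) $\bar a \le^*_\varphi \bar b$,
(ii) $\exists X\ \mathcal{Q}(X, \bar a, \bar b)$,
(iii) $\exists X \in \mathrm{HYP}(\mathfrak{A})\ \mathcal{Q}(X, \bar a, \bar b)$.


PROOF. Let $\varphi_\le : \mathrm{Pow}(A^{2n}) \to \mathrm{Pow}(A^{2n})$ be
the operator used in the First Stage Comparison Theorem 3.3.12. Let
$\mathcal{Q}(X, \bar a, \bar b) \iff (\bar a, \bar b) \in X\ \&\
X \subseteq \varphi_\le(X)$, for $X \subseteq A^{2n}$ and
$\bar a, \bar b \in A^n$. Then $\mathcal{Q}$ is elementary over
$\mathfrak{A}$. To see that (i) and (ii) are equivalent,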


€@<*b oO 4b <*a4 © (ba) € 02 © (4,6) Z(G-)" 
@ AVY[6.(Y)CY > Y(4,5)] 
© 3X [¢.(4X)C AX & X(a, b)] @ 3X Q(X, G, 5). 


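As (iii) $\to$ (ii) is trivial it only remains to show that (i) $\to$ (iii).
So let $\bar a \le^*_\varphi \bar b$. Then for some $\lambda < |\varphi|$,
$\bar a \in \varphi^\lambda$ and hence by Lemma 3.3.13,
$(\bar a, \bar b) \in \varphi_\le^\lambda$. As
$\varphi_\le^\lambda \subseteq \varphi_\le(\varphi_\le^\lambda)$ it follows
that $\mathcal{Q}(\varphi_\le^\lambda, \bar a, \bar b)$. As
$\lambda < |\varphi| = |\varphi_\le|$ it follows from Corollary 3.3.15 that
$\varphi_\le^\lambda \in \mathrm{HYP}(\mathfrak{A})$. Hence
$\exists X \in \mathrm{HYP}(\mathfrak{A})\ \mathcal{Q}(X, \bar a, \bar b)$. □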


We end this outline of the theory of positive elementary induction by 
giving, without proof, Moschovakis’s normal form theorem for an accept- 
able structure. For this we need to introduce the game quantifier. 

In general, by a quantifier on a set $A$ we mean a set
$Q \subseteq \mathrm{Pow}(A)$. If $R \subseteq A$ we write $Qz\,R(z)$ instead
of $R \in Q$. For example the existential and universal quantifiers on $A$
are $\exists = \mathrm{Pow}(A) - \{\emptyset\}$ and $\forall = \{A\}$.

Given an elementary coding scheme for a structure $\mathfrak{A}$ the game
quantifier $G$ on $A$ is the set of $R \subseteq A$ such that
\[ (*)\qquad \{\forall x_1 \exists y_1 \forall x_2 \exists y_2 \cdots\}
   \bigvee_m R(\langle x_1, y_1, \dots, x_m, y_m \rangle). \]

Eq. (*) is interpreted in the usual way in terms of a game $\mathcal{G}(R)$
between two players $\exists$ and $\forall$ who alternately choose elements
of $A$, starting with $\forall$, to produce a sequence
$x_1, y_1, x_2, y_2, \dots$. If for some $m$,
$\langle x_1, y_1, \dots, x_m, y_m \rangle \in R$, then player $\exists$
wins. If this never happens then player $\forall$ wins. Now (*) is
interpreted to mean that player $\exists$ has a winning strategy for the
game $\mathcal{G}(R)$.
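
A bounded version of this game can be solved by direct recursion, which may help in reading (*). The sketch below is an assumption-laden illustration: the set of moves is finite, the play is cut off after `max_rounds` rounds (the real game is unbounded), and coded sequences are replaced by Python tuples. The name `exists_wins` is hypothetical.

```python
def exists_wins(R, A, max_rounds, history=()):
    """True if the existential player can force some even-length prefix of
    the play into R within max_rounds rounds; the universal player moves
    first in every round. R is a predicate on tuples."""
    if history and R(history):
        return True                      # the existential player has won
    if len(history) == 2 * max_rounds:
        return False                     # horizon reached without a win
    return all(
        any(exists_wins(R, A, max_rounds, history + (x, y)) for y in A)
        for x in A
    )

A = (0, 1)                               # finite stand-in for A (assumption)
copy_last = lambda h: h[-1] == h[-2]     # win by repeating the last move
never = lambda h: False                  # no position is winning
echo_first = lambda h: len(h) == 4 and h[3] == h[0]   # win in round 2 only
```

The nesting of `all` over the universal player's move and `any` over the reply mirrors the quantifier prefix in (*), one round at a time.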


3.3.22. THEOREM (5C of MOSCHOVAKIS [1974a]). Let $\mathfrak{A}$ be an
acceptable structure. Let $G$ be the game quantifier on $A$ relative to an
elementary coding scheme for $\mathfrak{A}$. Then for $R \subseteq A^n$,
$R \in \mathrm{IND}(\mathfrak{A})$ iff there is an elementary relation
$S \subseteq A^{n+1}$ such that for $\bar a \in A^n$,
\[ R(\bar a) \iff Gz\,S(z, \bar a). \]


3.4. Relativisation to a non-trivial monotone quantifier 

In this subsection we indicate how the notions and results of the last
section can be generalised by relativising to a non-trivial monotone
quantifier $Q$ on $A$. $Q$ ($\subseteq \mathrm{Pow}(A)$) is non-trivial if
$Q \ne \emptyset$ and $Q \ne \mathrm{Pow}(A)$. $Q$ is monotone if
$A \supseteq X \supseteq Y \in Q$ implies $X \in Q$. The quantifier
$\breve{Q}$ dual to $Q$ is given by
$\breve{Q}z\,R(z) \iff \neg Qz\,\neg R(z)$. The language $L^*(Q)$ is defined
like $L^*$ except that formulae $Qz\,\phi(z)$ and $\breve{Q}z\,\phi(z)$ are
allowed for any formula $\phi(z)$.

The standard interpretation of $L^*$ is extended to sentences of $L^*(Q)$ by
requiring that
\[ Qx\,\phi(x) \text{ is true} \quad\text{iff}\quad
   \{a \in A \mid \phi(a) \text{ is true}\} \in Q, \]
and similarly for $\breve{Q}x\,\phi(x)$.
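
On a finite set these definitions can be tested mechanically. The sketch below assumes a four-element stand-in for $A$ and represents a quantifier as a Python set of frozensets; `powerset`, `is_nontrivial`, `is_monotone`, `dual` and the example quantifier `AT_LEAST_TWO` are hypothetical names.

```python
from itertools import combinations

def powerset(A):
    """All subsets of A as a set of frozensets."""
    items = list(A)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}

def is_nontrivial(Q, A):
    """Q is neither empty nor all of Pow(A)."""
    return len(Q) != 0 and Q != powerset(A)

def is_monotone(Q, A):
    """Every superset (within A) of a member of Q is again in Q."""
    return all(X in Q for Y in Q for X in powerset(A) if Y <= X)

def dual(Q, A):
    """R is in the dual of Q  iff  the complement of R is not in Q."""
    A = frozenset(A)
    return {X for X in powerset(A) if (A - X) not in Q}

A = frozenset(range(4))           # finite stand-in for A (assumption)
EXISTS = powerset(A) - {frozenset()}
FORALL = {A}
AT_LEAST_TWO = {X for X in powerset(A) if len(X) >= 2}
```

With these definitions the dual of the existential quantifier is the universal one and vice versa, and taking the dual twice returns the original quantifier.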

The language $L(\mathfrak{A}, Q)$ is the sublanguage of $L^*(Q)$
corresponding to the sublanguage $L(\mathfrak{A})$ of $L^*$. Positive and
negative occurrences of relations in a formula of $L^*(Q)$ are defined by
treating $Q$ and $\breve{Q}$ in the same way as the ordinary quantifiers.
Define the $Q$-elementary formulae of $L^*(Q)$ like the elementary formulae
of $L^*$ except that $Qx$ and $\breve{Q}x$ are allowed. Now define
$\mathrm{IND}(\mathfrak{A}, Q)$ as in the definition of
$\mathrm{IND}(\mathfrak{A})$ except using positive $Q$-elementary operators
instead of the positive elementary ones.

Now the results of 3.3 concerning $\mathrm{IND}(\mathfrak{A})$ will carry
over to $\mathrm{IND}(\mathfrak{A}, Q)$ with suitable changes, i.e.
"elementary" should be replaced by "$Q$-elementary". Where $\mathfrak{A}$ is
required to be acceptable, it is sufficient here that $\mathfrak{A}$ is
$Q$-acceptable, i.e. there is a $Q$-elementary coding scheme. Among the
positive $Q$-elementary operators there are the $Q^n$ and $\breve{Q}^n$
defined like $\forall^n$ and $\exists^n$. The theory $T_\infty(\mathfrak{A})$
needs to be replaced by the theory $T_\infty(\mathfrak{A}, Q)$. This is
obtained from $T_\infty(\mathfrak{A})$ by allowing $Q$-elementary formulae
and adding the following possibly infinitary rules.

$Q$-rule: From $\theta(a)$ for $a \in X$ infer $Qx\,\theta(x)$ if $X \in Q$.

$\breve{Q}$-rule: From $\theta(a)$ for $a \in X$ infer
$\breve{Q}x\,\theta(x)$ if $X \in \breve{Q}$.
(See ACZEL [1970] where such rules are considered.) Using
$T_\infty(\mathfrak{A}, Q)$, results 3.3.3-3.3.6 carry over to
$\mathrm{IND}(\mathfrak{A}, Q)$. There does not seem to be any analogue of
Theorem 3.3.7. Theorem 3.3.17 carries over to yield that if $\mathfrak{A}$ is
$Q$-acceptable then $\mathrm{IND}(\mathfrak{A}, Q)$ is the smallest Spector
class over $\mathfrak{A}$ that is closed under each $Q^n$ and $\breve{Q}^n$.

The Abstract Spector-Gandy Theorem 3.3.20 also carries over to
$\mathrm{IND}(\mathfrak{A}, Q)$. Of course $\mathrm{HYP}(\mathfrak{A}, Q)$
must be used instead of $\mathrm{HYP}(\mathfrak{A})$.

The Normal Form Theorem 3.3.22 also carries over, but for this it is
necessary to generalise the game quantifier. This has been carried out in
ACZEL [1975] where it is shown how to interpret any infinite string
$Q_1x_1\,Q_2x_2 \cdots$ of non-trivial monotone quantifiers as the following
alternating string of ordinary quantifiers:
\[ \exists X_1 \in Q_1\ \forall x_1 \in X_1\ \exists X_2 \in Q_2\
   \forall x_2 \in X_2 \cdots \]

This can be interpreted in terms of a game between two players as in the
definition of the game quantifier. Now given a $Q$-elementary coding scheme
for the structure $\mathfrak{A}$ we define the relativisation $Q^*$ of the
game quantifier as the set of $R \subseteq A$ such that
\[ \{Qx_1\,\breve{Q}y_1\,Qx_2\,\breve{Q}y_2\,Qx_3\,\breve{Q}y_3 \cdots\}
   \bigvee_{m \in \omega} R(\langle x_1, y_1, \dots, x_m, y_m \rangle). \]

Note that $\forall^*$ is just the game quantifier $G$. Theorem 3.3.22 carries
over to $\mathrm{IND}(\mathfrak{A}, Q)$ if $G$ is replaced by $Q^*$.
We end this section by stating an important recently obtained result. 


3.4.1. THEOREM (HARRINGTON [1975]). Let $\mathfrak{A}$ be an acceptable
structure. Every Spector class over $\mathfrak{A}$ has the form
$\mathrm{IND}(\mathfrak{A}, Q)$ for some non-trivial monotone quantifier $Q$
on $A$.


3.5. Non-monotone induction 

The first examples of non-monotone inductive definitions were those
that appeared in the construction of various systems of notations for larger
and larger segments of the countable ordinals. The first such systems were
those of Church-Kleene for the recursive ordinals. One such example is
Kleene's $\mathcal{O}$ considered in 2.2. The first ordinal not having a
notation in $\mathcal{O}$ is the Church-Kleene ordinal $\omega_1$, that is a
recursive analogue of the first uncountable ordinal. The inductive definition
of $\mathcal{O}$ uses a monotone operator. But attempts to extend
$\mathcal{O}$ to systems of notations for recursive analogues of the higher
number classes soon lead to non-monotone operators. For example let
$\varphi : \mathrm{Pow}(\omega) \to \mathrm{Pow}(\omega)$ be the monotone
operator associated with the regular arithmetical rule set generating
Kleene's $\mathcal{O}$ in 2.2. We may extend $\mathcal{O}$ by adding a
notation 7 (say) for the ordinal $\omega_1$ and then continuing as before,
i.e. we define a non-monotone operator
$\varphi_1 : \mathrm{Pow}(\omega) \to \mathrm{Pow}(\omega)$ given by
\[ \varphi_1(X) = \begin{cases}
   \varphi(X) & \text{if } \varphi(X) \not\subseteq X, \\
   \{7\} & \text{if } \varphi(X) \subseteq X,
   \end{cases} \qquad\text{for } X \subseteq \omega. \]

Then for $\lambda < |\varphi| = \omega_1$,
$\varphi_1^\lambda = \varphi^\lambda$ so that
$\varphi_1^{\omega_1} = \varphi^{<\omega_1} \cup \{7\} =
\mathcal{O} \cup \{7\}$. Hence $|7|_{\varphi_1} = \omega_1$. It is not hard
to see that $|\varphi_1| = \omega_1 + \omega_1$ so that $\varphi_1^\infty$ is
a set of notations for the ordinals $< \omega_1 + \omega_1$. The above may be
continued very much further and leads to notation systems for much larger
ordinals such as recursive analogues of the finite and transfinite number
classes. (See ADDISON and KLEENE [1957], KREIDER and ROGERS [1961], PUTNAM
[1961] and RICHTER [1965, 1967, 1968].)

The above work was concerned with the details of specific notation
systems. In PUTNAM [1964] a more general approach was taken where it was
shown that arbitrary $\Delta^1_2$ inductive definitions can only give
notation systems for an initial segment of the $\Delta^1_2$ ordinals (i.e.
the order types of $\Delta^1_2$ well-orderings of natural numbers). Further
work has shifted the interest from the study of specific inductively defined
notation systems to the study of classes $\mathcal{C}$ of inductive
definitions on $\omega$, and the associated ordinals $|\mathcal{C}|$. This
shift, encouraged by Gandy, led to the work of RICHTER [1971], ACZEL and
RICHTER [1972], RICHTER and ACZEL [1974], AANDERAA [1974], CENZER [1974] and
RICHTER [1976]. This work gives characterisations of $|\mathcal{C}|$, for
various classes $\mathcal{C}$, in terms of recursive analogues of large
cardinals. Such analogues were defined using Kripke's theory of recursion on
admissible ordinals. The starting point is to take the admissible ordinals as
the recursive analogue of the regular ordinals. It turns out that the first
admissible $> \omega$ is the Church-Kleene ordinal $\omega_1$ so that the two
approaches to recursive analogues agree. Other analogues are the recursive
inaccessibles (admissible limits of admissible ordinals) and the recursively
Mahlo ordinals (admissibles $\alpha$ such that for every $\alpha$-recursive
$f : \alpha \to \alpha$ there is an admissible $\beta < \alpha$ closed under
$f$). For recursive analogues of even larger cardinals reflection properties
were introduced in RICHTER and ACZEL [1974]. Examples are the
$\Pi^0_{n+2}$-reflecting ordinals which give a recursive analogue for the
$\Pi^1_n$-indescribable cardinals for $n > 0$.

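Below we shall state some of the results obtained. But first we need a
construction invented in RICHTER [1971]. Given
$\varphi, \psi : \mathrm{Pow}(\omega) \to \mathrm{Pow}(\omega)$ define
$[\varphi, \psi] : \mathrm{Pow}(\omega) \to \mathrm{Pow}(\omega)$ by
\[ [\varphi, \psi](X) = \begin{cases}
   \varphi(X) & \text{if } \varphi(X) \not\subseteq X, \\
   \psi(X) & \text{if } \varphi(X) \subseteq X,
   \end{cases} \qquad\text{for } X \subseteq \omega. \]

Note that $\varphi_1$ introduced above is $[\varphi, \psi]$ where
$\psi(X) = \{7\}$ for all $X \subseteq \omega$.
If $\mathcal{C}_1$, $\mathcal{C}_2$ are classes of operators on $\omega$ let
$[\mathcal{C}_1, \mathcal{C}_2] = \{[\varphi_1, \varphi_2] \mid
\varphi_1 \in \mathcal{C}_1 \text{ and } \varphi_2 \in \mathcal{C}_2\}$.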
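
The bracket construction is easy to experiment with on a toy scale. The sketch below is a finite analogue only: the operator `phi` generates $0, 1, 2$ and, once a seed is present, $8, 9$, which stands in loosely for "continuing as before" after the extra notation 7 has been added. The names `bracket` and `stages` and the set `allowed` are hypothetical.

```python
def bracket(phi, psi):
    """[phi, psi](X) = phi(X) if phi(X) is not a subset of X, else psi(X)."""
    def gamma(X):
        return phi(X) if not phi(X) <= X else psi(X)
    return gamma

def stages(op):
    """Cumulative stages X -> X | op(X) from the empty set, listing every
    stage that adds something new."""
    out, X = [], frozenset()
    while True:
        nxt = X | op(X)
        if nxt == X:
            return out
        out.append(nxt)
        X = nxt

def phi(X):
    """Monotone toy operator: 0 is an axiom, x + 1 follows from x when
    x + 1 is one of the allowed values (an illustrative restriction)."""
    allowed = {1, 2, 8, 9}
    return frozenset({0}) | frozenset(x + 1 for x in X if x + 1 in allowed)

def psi(X):
    """Constant operator contributing the extra element 7."""
    return frozenset({7})

plain = stages(phi)
extended = stages(bracket(phi, psi))
gamma = bracket(phi, psi)
```

The plain induction stops after three stages; the bracketed one adds 7 exactly when `phi` has closed off and then runs for three more stages, a small-scale picture of the jump from $\omega_1$ to $\omega_1 + \omega_1$ described above. The operator `gamma` is not monotone: it sends the empty set to $\{0\}$ but the larger set $\{0, 1, 2\}$ to $\{7\}$.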


3.5.1. PROPOSITION. (i) $|\Pi^0_0| = \omega$.
(ii) (Gandy, unpublished) $|\Pi^0_1| = \omega_1$.
(iii) (PUTNAM [1964]) $|\Delta^1_2| = \delta^1_2$, the first
non-$\Delta^1_2$ ordinal.
(iv) (RICHTER [1971]) $|[\Pi^0_0, \Pi^0_0]| = \omega^2$,
$|[\Pi^0_1, \Pi^0_0]|$ = first recursive inaccessible,
$|[\Pi^0_1, \Pi^0_1]|$ = first recursively Mahlo ordinal.
(v) (ACZEL and RICHTER [1972]) $|\Pi^0_{n+1}| = |\Sigma^0_{n+2}|$ = first
$\Pi^0_{n+2}$ reflecting ordinal.
(vi) (ACZEL and RICHTER [1972]) $|\Pi^1_1|$ = first $\Pi^1_1$ reflecting
ordinal, $|\Sigma^1_1|$ = first $\Sigma^1_1$ reflecting ordinal.
(vii) (RICHTER [1976]) $|\Pi^1_2|$ = first $\Pi^1_2$ reflecting ordinal.
(viii) (AANDERAA [1974]) $|\Pi^1_1| < |\Sigma^1_1|$ and
$|\Sigma^1_2| < |\Pi^1_2|$.


Many more results may be found in the above-mentioned papers. It is 
interesting to compare these results with corresponding results for classes 
of positive and monotone operators on w. 


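3.5.2. PROPOSITION. (i) $|\text{pos-}\Sigma^0_1| = |\text{mon-}\Sigma^0_1| =
\omega$.
(ii) (SPECTOR [1961]) $|\text{pos-}\Pi^0_1| = |\text{mon-}\Pi^1_1| =
\omega_1$.
(iii) (Grilliot, unpublished) $|\text{pos-}\Sigma^1_1| =
|\text{mon-}\Sigma^1_1| = |\Pi^1_1|$.
(iv) (Gandy, unpublished) $|\text{pos-}\Sigma^1_2| =
|\text{mon-}\Sigma^1_2| = \delta^1_2$.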


Moschovakis has recently raised the problem of characterising
$|\text{pos-}\Pi^1_2| = |\text{mon-}\Pi^1_2|$. It is not even known whether
this ordinal is admissible.

In the above we have only stated results about $|\mathcal{C}|$. Of course
the class $\mathrm{IND}(\mathcal{C})$ is also of interest. Under general
conditions on $\mathcal{C}$ the ordinal $|\mathcal{C}|$ is admissible and
$\mathrm{IND}(\mathcal{C})$ is a Spector class with
$|\mathcal{C}| = o(\mathrm{IND}(\mathcal{C}))$. Moreover in many cases
considered in RICHTER and ACZEL [1974], $\mathrm{IND}(\mathcal{C})$ may be
characterized, in terms of $\alpha$-recursion, as the class of
$\alpha$-r.e. relations on $\omega$ where $\alpha = |\mathcal{C}|$.

The theory of non-monotone induction on $\omega$ has been most elegantly
generalised to abstract structures in MOSCHOVAKIS [1974b]. The central
notion of that paper is that of a typical, non-monotone class of second
order relations over an abstract structure $\mathfrak{A}$. Examples are the
classes of $\Sigma^m_k$ or $\Pi^m_k$ definable second order relations over
$\mathfrak{A}$ when $m, k \ge 1$ or $m = 0$ and $k \ge 2$. In case $m = 0$
these classes are defined relative to a hyperelementary coding scheme over
$\mathfrak{A}$, as in 2.3.


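3.5.3. THEOREM (MOSCHOVAKIS [1974b]). Let $\mathcal{F}$ be a typical,
non-monotone class of second order relations on $\mathfrak{A}$. Let
$\mathcal{C}$ be the set of operators
$\varphi : \mathrm{Pow}(A^n) \to \mathrm{Pow}(A^n)$ such that
$\{(\bar x, X) \in A^n \times \mathrm{Pow}(A^n) \mid \bar x \in \varphi(X)\}$
is in $\mathcal{F}$. Then $\mathrm{IND}(\mathcal{C})$ is a Spector class over
$\mathfrak{A}$ with $|\mathcal{C}| = o(\mathrm{IND}(\mathcal{C}))$.


Moschovakis goes on to characterise $\mathrm{IND}(\mathcal{C})$ as the
smallest Spector class over $\mathfrak{A}$ satisfying certain additional
conditions.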

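We end this subsection by stating a special case of an interesting general
result of HARRINGTON and KECHRIS [1975].

Let $\mathfrak{A}$ be an acceptable structure, and let $\mathcal{C}$
($\text{pos-}\mathcal{C}$, $\text{mon-}\mathcal{C}$) be the class of
elementary (positive elementary, monotone elementary) operators over
$\mathfrak{A}$. Let $\mathrm{WF} = \{S \subseteq A^2 \mid S \text{ is
well-founded}\}$.


3.5.4. THEOREM (HARRINGTON and KECHRIS [1975]). If WF is elementary over
$\mathfrak{A}$, then
$\mathrm{IND}(\mathcal{C}) = \mathrm{IND}(\text{mon-}\mathcal{C})$.


This result is in contrast to the situation for countable $\mathfrak{A}$
when $\mathrm{IND}(\text{pos-}\mathcal{C}) = \Pi^1_1(\mathfrak{A}) =
\mathrm{IND}(\text{mon-}\mathcal{C})$. ($\mathrm{IND}(\mathcal{C})$ is always
very much larger than $\mathrm{IND}(\text{pos-}\mathcal{C})$.)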


3.6. Induction and admissible sets 

In this subsection we will assume some familiarity with the notion of an 
admissible set (see Chapter A.7). 

Given an admissible set $A$, let
$\mathfrak{A} = (A, \in\restriction A)$. The $\Sigma_1$ formulae of $L^*$ are
built up from the atomic formulae and their negations using $\vee$,
$\wedge$, $\exists x$ and the restricted universal quantifier
$\forall x \in y$. As in 2.3 the $\Sigma_1$ relations over $\mathfrak{A}$ are
those relations on $A$ definable over $\mathfrak{A}$ by a $\Sigma_1$ formula
of $L(\mathfrak{A})$. (Note that this means that constants for elements of
$A$ are allowed.) As in 3.1 we define the class of operators
$\varphi : \mathrm{Pow}(A^n) \to \mathrm{Pow}(A^n)$ that are positive
$\Sigma_1$ over $\mathfrak{A}$. The ordinal $o(A)$ of $A$ is the smallest
ordinal not in $A$. Finally, recall that a relation $R \subseteq A^n$ is
$A$-finite if $R \in A$. (Note that, as $A$ is closed under pairing,
$A^n \subseteq A$ for $n > 0$. So every relation on $A$ is a subset of $A$.)
Much of the interest in admissible sets centres around the infinitary
languages $L_A$ that are associated with them. Recall that these are
extensions of first order languages where the formulae are set-theoretically
represented as elements of $A$, and infinite conjunctions and disjunctions
$\bigwedge \Phi$ and $\bigvee \Phi$ are allowed as long as $\Phi$ is an
$A$-finite set of formulae. Just as the syntax of first order languages is
full of effectively presented finitary inductive definitions, the syntax of
$L_A$ uses the following. Below let $A$ be a fixed admissible set.


3.6.1. DEFINITION. (i) A rule set $\Phi$ on $A$ is $A$-finitary if the set
of premisses of every rule in $\Phi$ is $A$-finite.

(ii) An $A$-finitary rule set $\Phi$ is $\Sigma_1$ over $\mathfrak{A}$ if it
is so as a binary relation on $A$.


The following result essentially generalises Proposition 2.1.2 (in fact 
2.1.2 is essentially the special case when A is the set HF of hereditarily 
finite sets). 


3.6.2. PROPOSITION. If $\Phi$ is an $A$-finitary rule set that is $\Sigma_1$
over $\mathfrak{A}$, then $I(\Phi)$ is $\Sigma_1$ over $\mathfrak{A}$.


PROOF. Let $\varphi : \mathrm{Pow}(A) \to \mathrm{Pow}(A)$ be the monotone
operator associated with $\Phi$. Let $\mathrm{Pr}_\Phi(a, b)$ if $a \in A$ is
a $\Phi$-proof of $b$ (see Definition 1.4.1). Then it follows easily from the
assumptions that $\mathrm{Pr}_\Phi$ is $\Sigma_1$ over $\mathfrak{A}$. Hence
it would suffice to show that
$I(\Phi) = \{b \in A \mid \exists a \in A\ \mathrm{Pr}_\Phi(a, b)\}$. But the
natural proof of this will only work if $\mathfrak{A}$ satisfies "every set
can be well-ordered".

Hence we use a modification of the notion of $\Phi$-proof. We say that
$\{a_\mu\}_{\mu \le \lambda}$ is a $\varphi$-quasi-proof of $b$ if

(i) $a_\lambda = \{b\}$, and

(ii) $a_\mu \subseteq \varphi(\bigcup_{\nu < \mu} a_\nu)$ for all
$\mu \le \lambda$.
As in the proof of Proposition 1.4.2 we can show that for an arbitrary
monotone operator $\varphi$
\[ (*)\qquad I(\varphi) = \{a \in A \mid a \text{ has a }
   \varphi\text{-quasi-proof}\}. \]

But the proof of (*) requires no form of the axiom of choice, in contrast to
the proof of 1.4.2.
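
For finite sequences the two clauses defining a quasi-proof can be checked mechanically. The sketch below is only an illustration under that restriction (finite length, finite stand-in for $A$); the checker name `is_quasi_proof` and the toy operator `phi` are hypothetical.

```python
def is_quasi_proof(seq, b, phi):
    """seq = [a_0, ..., a_lambda] is a list of sets. Checks
    (i)  the last set is {b}, and
    (ii) every a_mu is a subset of phi(union of the earlier sets)."""
    if not seq or set(seq[-1]) != {b}:
        return False
    seen = set()
    for a_mu in seq:
        if not set(a_mu) <= phi(frozenset(seen)):
            return False
        seen |= set(a_mu)
    return True

def phi(X):
    """Toy monotone operator: 0 is an axiom, x + 1 follows from x (below 5)."""
    return frozenset({0}) | frozenset(x + 1 for x in X if x + 1 < 5)
```

Unlike a proof in the sense of a single well-ordered list of elements, each step here is a whole set, so no choice of one witness per premiss is needed; this matches the remark that (*) can be shown without any form of the axiom of choice.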
Now let qPrea(a, b) if aE A is a }-quasi-proof of b. As qPro is easily 
seen to be &, over YF it suffices to prove the 


Claim. I(@)={b € A |da EA aPre (a, b)}. 
The inclusion $\supseteq$ follows from (*) above. For the other inclusion it is
sufficient to show that the right-hand side is $\Phi$-closed. So let $\Phi : X \to b$
such that every element of X has a $\varphi$-quasi-proof in A, i.e.,


$\forall x \in X\ \exists a \in A\ \mathrm{qPr}_\varphi(a, x)$.


We must find a $\varphi$-quasi-proof of b in A. As $\mathrm{qPr}_\varphi$ is $\Sigma_1$ over $\mathfrak{A}$ we may use
strong $\Sigma_1$ collection to find a set $Y \in A$ such that


$\forall x \in X\ \exists a \in Y\ \mathrm{qPr}_\varphi(a, x)$
and
$\forall a \in Y\ \exists x \in X\ \mathrm{qPr}_\varphi(a, x)$.


It follows that each $a \in Y$ is a sequence $\{a_\mu\}_{\mu \le \lambda_a}$. Let $\lambda$ be the least
ordinal such that $\lambda > \lambda_a$ for all $a \in Y$, and let $c_\lambda = \{b\}$ and
$c_\mu = \bigcup\{a_\mu \mid a \in Y \text{ and } \mu \le \lambda_a\}$ for $\mu < \lambda$. Then as A is an admissible set,
$\lambda < o(A)$ and $c \in A$. Also, by construction c is a $\varphi$-quasi-proof of b. Hence
$\exists a \in A\ \mathrm{qPr}_\varphi(a, b)$. □


Note that it follows from the above claim that only $\varphi$-quasi-proofs
$\{a_\mu\}_{\mu \le \lambda}$ of length $\lambda < o(A)$ are needed. Hence $|\varphi| \le o(A)$.


REMARK. The same proof shows that if $\mathfrak{A}$ is a model of ZF, then $I(\Phi)$ is
first order definable over $\mathfrak{A}$ whenever $\Phi$ is an A-finitary rule set that is first
order definable over $\mathfrak{A}$.

There is an approach to Barwise's compactness theorem for $L_A$, when A
is a countable admissible set, that makes use of the class of positive $\Sigma_1$ over
$\mathfrak{A}$ inductive definitions. This idea was first suggested by Gandy. A version
of this approach may be found in ACZEL [1973]. Here we will just consider
the following result.


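3.6.3. THEOREM (Gandy). Let $\varphi : \mathrm{Pow}(A^n) \to \mathrm{Pow}(A^n)$ be positive $\Sigma_1$ over
$\mathfrak{A}$. Then

(i) if $R \subseteq A^n$ is $\Sigma_1$ over $\mathfrak{A}$, then $a \in \varphi(R)$ implies $a \in \varphi(R')$ for some
A-finite $R' \subseteq R$.

(ii) $I(\varphi)$ is $\Sigma_1$ over $\mathfrak{A}$ and $|\varphi| \le o(A)$.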


PROOF. (i) It suffices to show that for each $\Sigma_1$ formula $\theta(X)$ of $L(\mathfrak{A})$,
containing positive occurrences of the n-ary relation variable X, but no
other free variables, if $R \subseteq A^n$ is $\Sigma_1$ over $\mathfrak{A}$, then $\theta(R)$ implies $\theta(R')$ for
some A-finite $R' \subseteq R$. This may be proved by a straightforward induction
on the way $\theta(X)$ is built up. The key case, when $\theta(X)$ is a restricted
universal quantification, requires an application of $\Sigma_1$-collection.

(ii) Let $\Phi$ be the set of rules $X \to a$ such that $X \subseteq A^n$ is A-finite and
$a \in \varphi(X)$. Then clearly $\Phi$ is a $\Sigma_1$ over $\mathfrak{A}$, A-finitary rule set. Hence by
Proposition 3.6.2, $I(\Phi)$ is $\Sigma_1$ over $\mathfrak{A}$, so that it suffices to show that
$I(\varphi) = I(\Phi)$. Note that $\varphi$ is not necessarily the monotone operator
associated with $\Phi$, but by (i) it is so on $\Sigma_1$ over $\mathfrak{A}$ relations, and this will
suffice. To show that $I(\varphi) \subseteq I(\Phi)$ it suffices to show that $I(\Phi)$ is $\varphi$-closed.
So let $x \in \varphi(I(\Phi))$. As $I(\Phi)$ is $\Sigma_1$ over $\mathfrak{A}$, by (i) there is an A-finite
$X \subseteq I(\Phi)$ such that $x \in \varphi(X)$. Hence $\Phi : X \to x$, so that $x \in I(\Phi)$. To
show that $I(\Phi) \subseteq I(\varphi)$ it suffices to show that $I(\varphi)$ is $\Phi$-closed. So let
$\Phi : X \to x$ where $X \subseteq I(\varphi)$. Then $x \in \varphi(X) \subseteq \varphi(I(\varphi)) = I(\varphi)$ as required.
Finally, $|\varphi| \le o(A)$ may be seen to follow from the note at the end of the
proof of Proposition 3.6.2.


We end this subsection by stating the central result of BARWISE, GANDY
and MOSCHOVAKIS [1971].


3.6.4. DEFINITION. Let A be an admissible set.

(i) $\pi : D \to A$ is a projection of x on A if $D \subseteq x \in A$ and $\pi$ is a
surjection onto A.

$\mathfrak{A}$ is projectible to x if A admits a projection on x that is $\Delta_1$ over $\mathfrak{A}$ (i.e.
the graph of the projection and its complement are both $\Sigma_1$ over $\mathfrak{A}$).

(ii) $\tau : o(A) \to A$ is a resolution of A if $A = \bigcup_{\alpha < o(A)} \tau(\alpha)$.

$\mathfrak{A}$ is resolvable if A admits a resolution that is $\Delta_1$ over $\mathfrak{A}$.


3.6.5. THEOREM (BARWISE, GANDY and MOSCHOVAKIS [1971]). Let A be any
transitive set closed under pairing. Let


$A^+ = \bigcap\{B \mid A \in B \text{ and } B \text{ is admissible}\}$.

Then
(i) $A^+$ is admissible,
(ii) $\mathrm{HYP}(\mathfrak{A})$ is the set of $A^+$-finite relations on A and $\kappa(\mathfrak{A}) = o(A^+)$,
and if $\mathfrak{A}^+ = (A^+, \in\upharpoonright A^+)$, then
(iii) $\mathfrak{A}^+$ is resolvable and projectible to A and $\mathrm{IND}(\mathfrak{A})$ is the set of $\Sigma_1$ over
$\mathfrak{A}^+$ relations on A.


REMARKS. The above result has been generalised in two directions. Firstly,
$\mathrm{IND}(\mathfrak{A})$ can be replaced by an arbitrary Spector class $\Gamma$ over
$\mathfrak{A} = (A, \in\upharpoonright A)$ for any transitive set A. Then $A^+$ must be replaced by $M(\Delta)$
where $\Delta = \Gamma \cap \neg\Gamma$ and $M(\Delta) = \bigcap\{B \mid \Delta \subseteq B \text{ and } B \text{ is admissible}\}$.

In (ii), $\mathrm{HYP}(\mathfrak{A})$ and $\kappa(\mathfrak{A})$ must be replaced by $\Delta$ and $o(\Gamma)$ respectively.
(iii) is replaced by (iii)':

(iii)' There is a relation R on $M(\Delta)$ such that $M(\Delta)$ is admissible
relative to R and $\mathfrak{M}(\Delta) = (M(\Delta), \in\upharpoonright M(\Delta), R)$ is resolvable and projectible
to A. Moreover $\Gamma$ is the set of relations on A that are $\Sigma_1$ over $\mathfrak{M}(\Delta)$.

Although R and hence $\mathfrak{M}(\Delta)$ are not unique, the class of relations on
$M(\Delta)$ that are $\Sigma_1$ over $\mathfrak{M}(\Delta)$ is independent of the choice of R and is called
the companion of $\Gamma$. This generalization is 9E.1 of MOSCHOVAKIS [1974a].
The second generalization is to allow $\mathfrak{A}$ to be an arbitrary abstract
structure. This requires the notion of an admissible structure with set A of
urelemente that has been introduced in BARWISE [1974, 1975]. The details
have been worked out in ENNIS [1975]. There, Ennis shows that the result
still holds when (ii) in the definition of a Spector class is weakened to (ii)':

(ii)' $\Gamma$ contains the graph of a pairing function on A.


4. Induction in foundations 


In this final section we shall briefly consider the role of inductive 
definitions in the context of foundations. So far we have taken for granted 
the standard framework of modern mathematics, formalisable in ZF set 
theory. But the concept of an inductive definition can also be considered 
within the other conceptual frameworks that have arisen in work on 
foundations (e.g. finitist, predicative or intuitionistic mathematics). Within 
a non-classical framework there arises the question of which inductive 
definitions can be justified. 

It will be helpful to make the distinction between fundamental and 
non-fundamental inductive definitions, following the terminology of §53 of
KLEENE [1952]. This is a distinction concerning the context in which an 
inductive definition is presented. The inductive definition of the domain N 
of natural numbers is most naturally presented as a fundamental definition 
of a new sort of object. But in the context of ZF set theory the natural 
numbers $\emptyset, \{\emptyset\}, \{\emptyset, \{\emptyset\}\}, \ldots$ are objects that exist independently of the
inductive definition of w. So the latter is a non-fundamental inductive 
definition. Another example is the inductive definition of the set of even 
numbers as the smallest set of natural numbers containing 0, and n +2 
whenever it contains n. 
The standard model of ZF set theory is given by a fundamental inductive
definition of the universe of (well-founded) sets. This consists of the 
collection of rules: If s is a set of objects of the universe, then s itself is an 
object of the universe. 

While the fundamental domains associated with a conceptual framework 
will usually be explicit, it may require further analysis to decide which 
non-fundamental inductive definitions are justifiable. For example, given a 
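collection of rules $\Phi$ on the domain N of natural numbers, we have defined
$I(\Phi)$ as $\bigcap\{X \subseteq N \mid X \text{ is } \Phi\text{-closed}\}$, i.e.,


$n \in I(\Phi) \iff (\forall X \subseteq N)[X \text{ is } \Phi\text{-closed} \to n \in X]$.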


But this is an impredicative instance of the comprehension scheme (a 
subset of N is defined using quantification over all subsets of N). If 
impredicative definitions are not allowed we must look for other possible 
definitions of $I(\Phi)$. In case $\Phi$ is finitary we can use Proposition 1.1.4. In
general $I(\Phi)$ could be defined using transfinite iterations of operators, as in
Proposition 1.3.1 or transfinite proofs, as in Proposition 1.4.2. But this 
would require a suitable notion of ordinal which itself would need an 
inductive definition. An alternative approach is to take induction as a 
primitive method of definition, not needing justification in terms of other 
methods. 
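The definition of $I(\Phi)$ "from below", by iterating the operator associated with the rules, can be illustrated on a finite domain. The following Python sketch is only an illustration under simplifying assumptions: the domain is cut off at a hypothetical bound `LIMIT`, rules are represented as (premisses, conclusion) pairs, and the function name `induced_set` is not from the text.

```python
def induced_set(rules, universe):
    """Return (least set closed under `rules`, number of stages used).

    `rules` is a list of (premisses, conclusion) pairs, where `premisses`
    is a frozenset; this is a finite stand-in for a rule set.
    """
    current = set()
    stages = 0
    while True:
        # One application of the monotone operator associated with the rules:
        # collect every conclusion whose premisses are already available.
        step = {c for (prem, c) in rules if prem <= current and c in universe}
        if step <= current:
            return current, stages
        current |= step
        stages += 1


# Even numbers below LIMIT: the rule "-> 0" and the rules "{n} -> n + 2".
LIMIT = 20
universe = set(range(LIMIT))
even_rules = [(frozenset(), 0)] + [(frozenset({n}), n + 2) for n in range(LIMIT)]
evens, stages = induced_set(even_rules, universe)
print(sorted(evens), stages)
```

No quantification over all subsets of the domain is used here: the set is built up stage by stage, which is the finite shadow of the iteration described above.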

An example of a formal system treating induction as primitive is the
system $\mathrm{ID}_1$ of FEFERMAN [1970] (see also FRIEDMAN [1970], ZUCKER [1973]
and MARTIN-LÖF [1971]). The language of $\mathrm{ID}_1$ is obtained from the
language of formal arithmetic by adding a new n-ary relation symbol $P_\varphi$
for each arithmetical formula $\varphi = \varphi(X, \bar{x})$ containing only positive
occurrences of the n-ary relation variable X and at most the free individual
variables $\bar{x} = x_1, \ldots, x_n$. The axioms for $\mathrm{ID}_1$ are obtained from the axioms
for formal arithmetic by extending the mathematical induction scheme to
all formulae of $\mathrm{ID}_1$ and adding the following axiom and axiom scheme for
each $P_\varphi$.

(i) $\forall\bar{x}\,(\varphi(P_\varphi, \bar{x}) \to P_\varphi(\bar{x}))$,

(ii) $\forall\bar{x}\,(\varphi(\{\bar{z} \mid \psi(\bar{z})\}, \bar{x}) \to \psi(\bar{x})) \to \forall\bar{x}\,(P_\varphi(\bar{x}) \to \psi(\bar{x}))$,
for all formulae $\psi(\bar{z})$ of $\mathrm{ID}_1$.

In (ii), $\varphi(\{\bar{z} \mid \psi(\bar{z})\}, \bar{x})$ denotes the result of replacing each occurrence of
$X(\bar{t})$ in $\varphi(X, \bar{x})$ by $\psi(\bar{t})$, where $\bar{t} = t_1, \ldots, t_n$ is a sequence of terms, and
bound variables are changed as required by the usual conventions.

On the standard model of arithmetic these axioms express that $P_\varphi$ is
inductively defined by the positive arithmetical operator defined by
$\varphi(X, \bar{x})$.
The procedure for constructing $\mathrm{ID}_1$ from formal arithmetic may be
repeated to obtain $\mathrm{ID}_2, \mathrm{ID}_3, \ldots$. But the resulting systems are still much
weaker than fully impredicative systems such as second order arithmetic.
On the other hand $\mathrm{ID}_1$ is already stronger than the systems of predicative
mathematics of FEFERMAN [1968]. Thus inductive definability is a notion
intermediate in strength between predicative and fully impredicative 
definability. 

It would be interesting to formulate a coherent conceptual framework 
that made induction the principal notion. There are suggestions of this in 
the literature, but the possibility has not yet been fully explored. 
Introduction 


Our aim is to describe classical and effective descriptive set theory, with 
emphasis mainly on projective sets. We also give an account of the revival 
in this subject which has taken place in the last ten years, a revival based on 
strong set theoretic hypotheses — notably, projective determinacy. Our 
account has been strongly influenced by writings, lectures, and private 
remarks of Y.N. Moschovakis and A.S. Kechris. 

The study of arbitrary sets of real numbers has been notoriously 
unsuccessful. Cantor’s continuum hypothesis asserts that every set of real 
numbers is countable or has power $2^{\aleph_0}$, the cardinal number of the set of all
real numbers. The continuum hypothesis is not merely undecided: the 
famous theorems of GÖDEL [1940] and COHEN [1963, 1964] show that it is
undecidable on the basis of the usual Zermelo—Fraenkel (ZFC) axioms for 
set theory. Moreover, no candidates for new axioms are available which 
settle the continuum hypothesis and seem likely to achieve anything like 
general acceptance. There are other questions about arbitrary sets of real 
numbers which are less unmanageable, but whose solution leaves a very 
unsatisfactory situation. The theorem of Vitali asserts that not every set of 
real numbers is Lebesgue measurable, but its proof depends essentially on 
the axiom of choice. It remains possible that all sets of real numbers one 
encounters in practice are Lebesgue measurable. 

In descriptive set theory one restricts one’s attention to simple sets of 
real numbers: sets of simple topological structure or sets which are 
definable in some simple way. There are three main advantages to such a 
restricted interest. 

(1) Many questions which seem unanswerable for arbitrary sets (the 
continuum hypothesis) can be answered for sufficiently simple sets. 

(2) Many questions which have unpleasant answers for arbitrary sets 
have pleasant answers for sufficiently simple sets. 

(3) One derives an interesting structural theory of simple sets: a theory 
of definability. 

We can illustrate (1) and (2) by a well-known example. Perhaps the first 
theorem of descriptive set theory was this: Every closed set of real numbers 
is countable or is of power $2^{\aleph_0}$. Thus no closed set is a counterexample to the
continuum hypothesis. This example of (1) is also an example of (2). The 
full Cantor-Bendixson theorem asserts that every closed set of real numbers 
is countable or has a perfect subset — a closed subset without isolated 
points. Not all sets of reals have this pleasant property. 

Descriptive set theory as we know it today has two independent sources. 
First came the classical topological development originated by Borel,
Baire, and Lebesgue, and carried out by Souslin, Lusin, Sierpinski, and 
others. Later, though essentially independently, a different approach to the 
subject, based on definability and recursive function theory, was produced 
by Kleene and other logicians. Only in ADDISON [1959a] was it made clear
that the two apparently distinct subjects were in fact one subject. 

In Sections 1-3 we present the main concepts of the classical theory. In 
Section 4 we deal with the effective, or Kleene, theory. In Sections 5-6 we 
discuss questions in descriptive set theory which the ZFC axioms are not 
sufficient to answer, and we discuss how proposed new axioms for set 
theory allow one to answer such questions. 


1. Basic concepts of classical descriptive set theory 


Although in some ways the reals are our main interest, we shall not take 
the real line as the basic space in terms of which the theory is developed. 

One possible reason for this is that the subject applies to many spaces 
other than the reals. In fact, it applies to any complete separable metric 
space without isolated points, or perfect Polish space. We could then 
develop the theory for arbitrary perfect Polish spaces. This is, for example, 
the approach of MOSCHOVAKIS [1978].

We prefer, however, to deal with a single concrete Polish space. We do 
not use the real line, because that space is slightly awkward to use. Instead 
we use Baire space $\omega^\omega$, which we denote by $\mathcal{N}$.


1.1. DEFINITION. $\omega$ is the set of all natural numbers. $\mathcal{N}$ is the collection of
all functions $\alpha : \omega \to \omega$. We give $\omega$ the discrete topology and, thinking of $\mathcal{N}$
as the product of infinitely many copies of $\omega$, we give $\mathcal{N}$ the product
topology.

A base for the non-empty open subsets of $\mathcal{N}$ is the collection of all sets of
the form


$\{\alpha : \bar{\alpha}(n) = \sigma\}$

where n is a natural number, $\sigma$ is a sequence of natural numbers of length
n, and $\bar{\alpha}(n)$ denotes the finite sequence $(\alpha(0), \alpha(1), \ldots, \alpha(n-1))$.
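Membership in a basic open set depends only on a finite initial segment of the point. A minimal Python sketch, assuming points of Baire space are represented as functions from natural numbers to natural numbers and finite sequences as tuples (the name `in_basic_open` is hypothetical):

```python
def in_basic_open(alpha, sigma):
    """True if alpha lies in the basic open set determined by sigma,
    i.e. if the first len(sigma) values of alpha agree with sigma."""
    return all(alpha(k) == value for k, value in enumerate(sigma))


alpha = lambda n: n % 2          # the point 0, 1, 0, 1, ...
print(in_basic_open(alpha, (0, 1, 0)))
print(in_basic_open(alpha, (0, 0)))
```

The complement of such a set is a union of the finitely many other basic sets with sequences of the same length that differ from `sigma` somewhere, plus nothing else, which is why each basic open set is also closed.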


$\mathcal{N}$ is homeomorphic to the space of all irrationals with the topology
induced by the inclusion of the irrationals in the reals.
Although we develop the theory for $\mathcal{N}$, most definitions and theorems


786 MARTIN/ DESCRIPTIVE SET THEORY (cH. C.8, §1 


apply equally well to any perfect Polish space. (We shall try to note the 
instances where this fails; but we shall not always indicate the cases when 
our proofs do not work for general spaces.) In fact the theorems for 
arbitrary perfect Polish spaces usually follow from those for $\mathcal{N}$ using such
facts as that any two perfect Polish spaces are ‘‘Borel isomorphic’’. Some 
examples of perfect Polish spaces are 

(a) the reals, 

(b) the Cantor set, which is essentially $2^\omega$, the set of all $\alpha : \omega \to \{0, 1\}$,
with the topology induced by $2^\omega \subseteq \mathcal{N}$.

$\mathcal{N}$ is homeomorphic to its product with itself. If $\alpha \in \mathcal{N}$, let $\alpha_0(n) = \alpha(2n)$
and $\alpha_1(n) = \alpha(2n + 1)$. The function $f(\alpha) = (\alpha_0, \alpha_1)$ is a homeomorphism.
This homeomorphism preserves everything of importance in classical and
effective descriptive set theory. Hence the notion of dimension plays no
role in descriptive set theory. Thus, although our theory deals with finite
product spaces $\mathcal{N}^n$, we shall often state definitions and theorems simply for
the case $\mathcal{N}$, and no generality is lost in doing so. The fact that the simplest
bijection of the reals onto the plane is slightly more complicated than a
homeomorphism is one reason we do not base the theory on the reals.
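The splitting of a point into its even and odd parts can be checked on finite prefixes. The sketch below is illustrative only; it assumes points are represented as Python functions on the natural numbers, and the names `split` and `merge` are not from the text.

```python
def split(alpha, length):
    """Return the first `length` values of the even part and of the odd part."""
    even_part = [alpha(2 * n) for n in range(length)]
    odd_part = [alpha(2 * n + 1) for n in range(length)]
    return even_part, odd_part


def merge(even_part, odd_part):
    """Inverse direction on prefixes: interleave two equally long prefixes."""
    out = []
    for x, y in zip(even_part, odd_part):
        out.extend([x, y])
    return out


alpha = lambda n: n * n
a0, a1 = split(alpha, 4)
print(a0, a1, merge(a0, a1))
```

The first 2k values of a point determine the first k values of both parts, and conversely; this finite dependence in both directions is what makes the map and its inverse continuous.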


1.2. DEFINITION. Let $\mathcal{Y}$ be the collection of all finite products of $\mathcal{N}$ with
itself: $\mathcal{Y} = \{\mathcal{N}, \mathcal{N}^2, \mathcal{N}^3, \ldots\}$. Restricting a term of Moschovakis to the
present context, we call a subset of an element of $\mathcal{Y}$ a pointset.


We are also interested in collections of pointsets. We depart from 
Moschovakis’ terminology by making the irrelevance of dimension part of 
our definition, even at the cost of a slight artificiality. 


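1.3. DEFINITION. Let $f : \mathcal{N} \to \mathcal{N}^2$ be the homeomorphism defined above. If
$1 \le j \le n$, let $f^n_j : \mathcal{N}^n \to \mathcal{N}^{n+1}$ be the homeomorphism defined by


$f^n_j(\alpha_1, \alpha_2, \ldots, \alpha_n) = (\alpha_1, \ldots, \alpha_{j-1}, (\alpha_j)_0, (\alpha_j)_1, \alpha_{j+1}, \ldots, \alpha_n)$


where $f(\alpha_j) = ((\alpha_j)_0, (\alpha_j)_1)$. A collection $\Gamma$ of pointsets is a pointclass if, for
each n, j, and each $A \subseteq \mathcal{N}^n$,


$A \in \Gamma \iff f^n_j(A) \in \Gamma$.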


A pointclass $\Gamma$ is then determined completely by $\{A : A \in \Gamma \ \&\ A \subseteq \mathcal{N}\}$.
If we were using the reals instead of $\mathcal{N}$, we could not conveniently make the
irrelevance of dimension such a basic part of the theory. Most definitions of
particular pointclasses we give later will, nevertheless, make sense for the
reals, and practically all the theorems remain valid for the corresponding
"pointclasses" of subsets of finite products of the reals.


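1.4. DEFINITION. If $A \subseteq \mathcal{N}^n$ is a pointset, then $\bar{A} = \mathcal{N}^n - A$. If $\Gamma$ is a
pointclass, the dual $\breve{\Gamma}$ of $\Gamma$ is defined by


$A \in \breve{\Gamma} \iff \bar{A} \in \Gamma$.

$\Gamma$ is self-dual if $\Gamma = \breve{\Gamma}$.


For example, if $\Gamma$ is the class of all open sets, then $\breve{\Gamma}$ is the class of all
closed sets. In this case $\Gamma$ is not self-dual. If $\Gamma$ is the class of all clopen
(closed and open) sets, then $\breve{\Gamma} = \Gamma$. (The basic open subsets of $\mathcal{N}$ defined
above are clopen.)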


2. Borel and projective sets 


We wish to study the properties of simple or definable subsets of $\mathcal{N}$. This
does not mean we study directly the general concept of definable set. There 
is such a general notion (ordinal definability from reals is the relevant one 
here), but it is as unmanageable as the notion of arbitrary set of reals. Part 
of descriptive set theory is indeed general in the sense that one proves that 
certain properties, typical of definable sets, imply other properties. For the 
most part, however, we consider only sets which are definable in specific 
simple ways. In the context of classical descriptive set theory, we study the 
sets derived from the open sets by specific simple operations. The two main 
kinds of operations we shall consider here are 

(a) Boolean algebraic operations 
and 

(b) projection. 

For the moment we consider only (a). 


2.1. DEFINITION. A pointset $A \subseteq \mathcal{N}^n$ is Borel if A belongs to the $\sigma$-algebra
generated by the open subsets of $\mathcal{N}^n$.


In other words, the Borel subsets of $\mathcal{N}^n$ are the smallest class $\Gamma$
containing all open subsets of $\mathcal{N}^n$ and satisfying

$A \in \Gamma \iff \bar{A} \in \Gamma$,
$A_0, A_1, \ldots \in \Gamma \implies \bigcup_i A_i \in \Gamma$.

Clearly the class of all Borel subsets of elements of $\mathcal{Y}$ is a pointclass and is
self-dual.
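On a finite set the same closure process can be carried out mechanically, since countable unions reduce to finite ones. The Python sketch below is a finite analogue only, not a model of the Borel sets; the function name `generated_algebra` and the example generators are assumptions made for illustration.

```python
from itertools import combinations


def generated_algebra(universe, generators):
    """Smallest family of subsets of a finite `universe` that contains the
    generators and is closed under complement and (finite) union."""
    universe = frozenset(universe)
    family = {frozenset(g) for g in generators} | {frozenset(), universe}
    changed = True
    while changed:
        changed = False
        current = list(family)
        for a in current:
            complement = universe - a
            if complement not in family:
                family.add(complement)
                changed = True
        for a, b in combinations(current, 2):
            union = a | b
            if union not in family:
                family.add(union)
                changed = True
    return family


algebra = generated_algebra({0, 1, 2, 3}, [{0}, {1}])
print(len(algebra))
```

The points 2 and 3 are never separated by the generators, so the result has the three atoms {0}, {1}, {2, 3} and is self-dual by construction.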
We divide the Borel sets into a hierarchy as follows. 


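2.2. DEFINITION. For countable ordinals $\rho > 0$ we define inductively for
fixed $\mathcal{N}^n$:


$A \in \mathbf{\Sigma}^0_1 \iff A$ is open, $\qquad A \in \mathbf{\Pi}^0_\rho \iff \bar{A} \in \mathbf{\Sigma}^0_\rho$,


$A \in \mathbf{\Sigma}^0_\rho \iff$ there are sets $A_0, A_1, \ldots$ such that each $A_i \in \mathbf{\Pi}^0_{\rho_i}$
for some $\rho_i < \rho$ and $A = \bigcup_i A_i$ $(\rho > 1)$,


$A \in \mathbf{\Delta}^0_\rho \iff A \in \mathbf{\Sigma}^0_\rho \ \&\ A \in \mathbf{\Pi}^0_\rho$.

Each $\mathbf{\Sigma}^0_\rho$, $\mathbf{\Pi}^0_\rho$, $\mathbf{\Delta}^0_\rho$ is a pointclass. It is easy to prove that


$\bigcup_{0 < \rho < \omega_1} \mathbf{\Sigma}^0_\rho$ = the class of all Borel sets.


Our notation is that of the Kleene school. Classically sets in $\mathbf{\Sigma}^0_\rho$ are said
to be of additive class $\rho$, sets in $\mathbf{\Pi}^0_\rho$ of multiplicative class $\rho$, and sets in $\mathbf{\Delta}^0_\rho$ of
ambiguous class $\rho$. Also $\mathbf{\Sigma}^0_2 = F_\sigma$, $\mathbf{\Pi}^0_2 = G_\delta$, $\mathbf{\Sigma}^0_3 = G_{\delta\sigma}$, etc.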


2.3. THEOREM. $\mathbf{\Sigma}^0_\rho$, $\mathbf{\Pi}^0_\rho$, and $\mathbf{\Delta}^0_\rho$ are closed under finite unions, finite
intersections, and continuous preimages.


$\Gamma$ is closed under continuous preimages if, whenever $f : \mathcal{N}^n \to \mathcal{N}^m$ is
continuous and $A \in \Gamma$, then $f^{-1}(A) \in \Gamma$. The proof of the theorem is
routine.


2.4. DEFINITION. $U \subseteq \mathcal{N}^2$ is universal for a pointclass $\Gamma$ if $U \in \Gamma$ and the
set of all sets of the form


$\{\beta : (\alpha, \beta) \in U\}$


is exactly the collection of subsets of $\mathcal{N}$ which belong to $\Gamma$.
Most important non-self-dual pointclasses possess a universal set. 


2.5. THEOREM (LEBESGUE [1905]). For each $\rho < \omega_1$, there is a set U universal
for $\mathbf{\Sigma}^0_\rho$.


PROOF. We first consider the case $\rho = 1$. Let $B_0, B_1, \ldots$ be a base for the
open subsets of $\mathcal{N}$, with $B_0$ empty. Let

$(\alpha, \beta) \in U \iff \exists n\ (\beta \in B_{\alpha(n)})$.


It is easily checked that U is the desired universal set.
Suppose now that $\rho > 1$ and $U_\nu$ is universal for $\mathbf{\Sigma}^0_\nu$ for each $\nu < \rho$. Let
$g : \omega \to \{\nu : 1 \le \nu < \rho\}$ be such that each $g^{-1}(\nu)$ is infinite. Let


$(\alpha, \beta) \in U \iff \exists i\ ((\alpha_i, \beta) \notin U_{g(i)})$,


where $\alpha_i(n) = \alpha(p_i^{\,n+1})$ with $p_i$ the (i + 1)-st prime number. The functions $h_i$
defined by $h_i((\alpha, \beta)) = (\alpha_i, \beta)$ are continuous. Since the classes $\mathbf{\Pi}^0_\nu$ are
closed under continuous preimages, $\{(\alpha, \beta) : (\alpha_i, \beta) \notin U_{g(i)}\} \in \mathbf{\Pi}^0_{g(i)}$, so
$U \in \mathbf{\Sigma}^0_\rho$. To prove that U is universal, use the easy fact that $\emptyset$ belongs to
each $\mathbf{\Pi}^0_\nu$. □
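The coding of countably many points into one, by reading a point off along the powers of the (i + 1)-st prime, can be tried out on finite data. This is a hedged sketch: the helper names `nth_prime`, `component` and `encode` are hypothetical, points are Python functions, and positions that are not used by the coding are filled with 0 as an arbitrary choice.

```python
def nth_prime(i):
    """Return p_i, the (i + 1)-st prime: p_0 = 2, p_1 = 3, p_2 = 5, ..."""
    count, candidate = -1, 1
    while count < i:
        candidate += 1
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return candidate


def component(alpha, i):
    """The i-th decoded point: n -> alpha(p_i ** (n + 1))."""
    p = nth_prime(i)
    return lambda n: alpha(p ** (n + 1))


def encode(prefixes):
    """Build a point whose i-th component starts with prefixes[i]."""
    table = {}
    for i, prefix in enumerate(prefixes):
        p = nth_prime(i)
        for n, value in enumerate(prefix):
            table[p ** (n + 1)] = value
    return lambda k: table.get(k, 0)


alpha = encode([[7, 8, 9], [1, 2]])
print([component(alpha, 0)(n) for n in range(3)])
print([component(alpha, 1)(n) for n in range(2)])
```

Distinct prime powers never collide, so any sequence of prescribed components can be realised by a single point; each component depends on only finitely many values of the point at a time, which is the reason the decoding maps are continuous.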


2.6. COROLLARY. For each $\rho < \omega_1$, there is a set U universal for $\mathbf{\Pi}^0_\rho$.


2.7. REMARK. The universal set property is one of a number of properties
which tend to hold for $\breve{\Gamma}$ if and only if they hold for $\Gamma$.


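We are now ready to prove a hierarchy theorem for the classes $\mathbf{\Sigma}^0_\rho$.

2.8. THEOREM. Suppose $1 \le \nu < \rho < \omega_1$. Then $\mathbf{\Delta}^0_\nu \subsetneq \mathbf{\Sigma}^0_\nu \subsetneq \mathbf{\Delta}^0_\rho$.


PROOF. $\mathbf{\Delta}^0_\nu \subseteq \mathbf{\Sigma}^0_\nu$ by definition.
The main part of the theorem is $\mathbf{\Sigma}^0_\nu \not\subseteq \mathbf{\Delta}^0_\nu$. For this, let U be universal for
$\mathbf{\Sigma}^0_\nu$. Let us put


$\alpha \in A \iff (\alpha, \alpha) \in U$.


$A \in \mathbf{\Sigma}^0_\nu$, since $\mathbf{\Sigma}^0_\nu$ is closed under continuous preimages. If $A \in \mathbf{\Delta}^0_\nu$, then
$A \in \mathbf{\Pi}^0_\nu$. This means that for some $\alpha$


$\beta \in A \iff (\alpha, \beta) \notin U$.

But then

$(\alpha, \alpha) \in U \iff \alpha \in A \iff (\alpha, \alpha) \notin U$.


$\mathbf{\Sigma}^0_1 \subseteq \mathbf{\Sigma}^0_2$, since every open set is a countable union of closed sets (the
basic open sets). If $1 < \nu < \rho$, then by definition $\mathbf{\Sigma}^0_\nu \subseteq \mathbf{\Sigma}^0_\rho$. If $1 \le \nu < \rho$ and
$A \in \mathbf{\Pi}^0_\nu$, then $A = \bigcup_i A_i$, where each $A_i = A$. Hence $\mathbf{\Pi}^0_\nu \subseteq \mathbf{\Sigma}^0_\rho$, and so
$\mathbf{\Sigma}^0_\nu \subseteq \mathbf{\Pi}^0_\rho$. We then have shown $\mathbf{\Sigma}^0_\nu \subseteq \mathbf{\Delta}^0_\rho$ for $1 \le \nu < \rho$. If $1 \le \nu < \rho$, then
$\mathbf{\Sigma}^0_\nu \ne \mathbf{\Delta}^0_\rho$, since $\mathbf{\Pi}^0_\nu \subseteq \mathbf{\Delta}^0_\rho$ but $\mathbf{\Pi}^0_\nu \not\subseteq \mathbf{\Sigma}^0_\nu$. □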
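The diagonal step in this proof has a familiar finite analogue. In the Python sketch below, the rows of a square 0/1 table stand in for the sections of a universal set, and flipping the diagonal produces a row that the table cannot contain. The table and the name `diagonal_complement` are illustrative assumptions, not objects from the text.

```python
def diagonal_complement(table):
    """Flip the diagonal of a square 0/1 table.

    Row i of `table` plays the role of the i-th section of a universal set;
    the returned row differs from row i at position i, for every i.
    """
    return [1 - table[i][i] for i in range(len(table))]


table = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 0],
]
flipped = diagonal_complement(table)
print(flipped, flipped in table)
```

In the theorem the diagonal set itself is a section of the universal set, so its complement cannot be one; that is exactly what separates the class from its dual.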


We now consider a different operation for generating sets from the open 
sets, the operation of projection. 
2.9. DEFINITION. If $A \subseteq \mathcal{N}^{n+1}$, the projection of A is
$\{(\alpha_1, \ldots, \alpha_n) : \exists\beta\ ((\alpha_1, \ldots, \alpha_n, \beta) \in A)\}$.


The projection of A is then a subset of $\mathcal{N}^n$. If we wished to avoid product
spaces, we could replace projection in what follows by continuous image. 
The following notion is due to Lusin [1925] and (apparently independently) 
SIERPINSKI [1925]. 


2.10. DEFINITION. The class of projective sets is the closure of the open sets 
under the operations of projection and complementation. 


One can show that the projective sets form a pointclass and are closed 
(within each $\mathcal{N}^n$) under unions and intersections (but not countable unions
and intersections). We divide the projective sets into a hierarchy as follows: 


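2.11. DEFINITION. For $n \in \omega$ we define inductively

$\mathbf{\Sigma}^1_0 = \mathbf{\Sigma}^0_1$ = the class of open sets,
$\mathbf{\Pi}^1_n = \breve{\mathbf{\Sigma}}^1_n$,
$A \in \mathbf{\Sigma}^1_{n+1} \iff \exists B \in \mathbf{\Pi}^1_n$ (A is the projection of B),
$\mathbf{\Delta}^1_n = \mathbf{\Sigma}^1_n \cap \mathbf{\Pi}^1_n$.

2.12. REMARK. For the reals or $2^\omega$ we must modify the definitions of
projective and $\mathbf{\Sigma}^1_n$ by replacing $\mathbf{\Sigma}^0_1$ by $\mathbf{\Sigma}^0_2$. Classically sets in $\mathbf{\Sigma}^1_1$ are called
analytic, sets in $\mathbf{\Pi}^1_1$ are called CA, sets in $\mathbf{\Sigma}^1_2$ are called PCA, sets in $\mathbf{\Pi}^1_2$ are
called CPCA, etc.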


The $\mathbf{\Sigma}^1_n$, $\mathbf{\Pi}^1_n$, and $\mathbf{\Delta}^1_n$ are pointclasses. It is easily proved that the class of
projective sets is $\bigcup_n \mathbf{\Sigma}^1_n$. Once again we have a universal set theorem and a
hierarchy theorem. 


2.13. THEOREM. $\mathbf{\Sigma}^1_n$, $\mathbf{\Pi}^1_n$, and $\mathbf{\Delta}^1_n$ are closed under finite unions, finite
intersections, and continuous preimages.


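2.14. THEOREM (LUSIN). There is a set $U_n$ universal for $\mathbf{\Sigma}^1_n$ for each n.

PROOF. Let $U^* \subseteq \mathcal{N}^{n+2}$ be universal for $\mathbf{\Sigma}^0_1$. (I.e., the sets of the form
$\{(\beta, \beta_1, \ldots, \beta_n) : (\alpha, \beta, \beta_1, \ldots, \beta_n) \in U^*\}$ are exactly the open subsets of
$\mathcal{N}^{n+1}$.) Assume n even for definiteness: let

$(\alpha, \beta) \in U_n \iff \exists\beta_1\, \forall\beta_2 \cdots \forall\beta_n\ (\alpha, \beta, \beta_1, \ldots, \beta_n) \in U^*$. □

2.15. THEOREM. $\mathbf{\Delta}^1_n \subsetneq \mathbf{\Sigma}^1_n \subsetneq \mathbf{\Delta}^1_{n+1}$.

PROOF. The proof is similar to that of Theorem 2.8. □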


The Boolean algebraic operations which generate the Borel sets are 
quite different from the operation of projection. Nevertheless, the two 
sorts of pointclasses are intimately related. In order to prove a theorem 
about their relation, we first develop some theory of $\mathbf{\Pi}^1_1$ which will also be
useful later. 

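Assume n is odd for definiteness. Then $A \subseteq \mathcal{N}$ belongs to $\mathbf{\Pi}^1_n$ just in case
there is an open set $B \subseteq \mathcal{N}^{n+1}$ such that


$\alpha \in A \iff \forall\beta_1\, \exists\beta_2 \cdots \forall\beta_n\ (\alpha, \beta_1, \ldots, \beta_n) \in B$.

The theorem below follows by the definition of open set.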
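
2.16. THEOREM. $A \in \mathbf{\Pi}^1_n$, n odd [$A \in \mathbf{\Sigma}^1_n$, n even], just in case there is a
relation R such that

$\alpha \in A \iff \forall\beta_1\, \exists\beta_2 \cdots \forall\beta_n\, \exists m\ R(\bar{\alpha}(m), \bar{\beta}_1(m), \ldots, \bar{\beta}_n(m))$,

[$\alpha \in A \iff \exists\beta_1\, \forall\beta_2 \cdots \forall\beta_n\, \exists m\ R(\bar{\alpha}(m), \bar{\beta}_1(m), \ldots, \bar{\beta}_n(m))$],

where $\bar{\beta}(m)$ denotes the sequence $(\beta(0), \ldots, \beta(m-1))$.

If $A \in \mathbf{\Pi}^1_1$, there is an R such that

$\alpha \in A \iff \forall\beta\, \exists m\ R(\bar{\alpha}(m), \bar{\beta}(m))$.

For such an R, define for $\alpha \in \mathcal{N}$,

$X_{\alpha,R} = \{\bar{\beta}(m) : \forall n \le m\ \neg R(\bar{\alpha}(n), \bar{\beta}(n))\}$.

Let Seq be the collection of all finite sequences of natural numbers.

2.17. DEFINITION. If $\sigma, \tau \in \mathrm{Seq}$, let $\sigma \prec \tau$ mean that $\sigma$ is a proper extension
of $\tau$. $\prec$ is a partial ordering of Seq.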

2.18. THEOREM. For A, R as above,


$\alpha \in A \iff \prec$ is well-founded on $X_{\alpha,R}$.


PROOF. Infinite descending chains in $X_{\alpha,R}$ correspond to functions $\beta$ such
that $\forall m\ \neg R(\bar{\alpha}(m), \bar{\beta}(m))$. □
2.19. DEFINITION. Let $\prec$ be a well-founded relation on a set X. For $x \in X$,
define $|x|^{\prec}$ inductively by

$|x|^{\prec} = \sup_{y \prec x}\, (|y|^{\prec} + 1)$.

The length of $\prec$ is

$\sup_{x \in X}\, (|x|^{\prec} + 1)$.
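For a finite set of finite sequences, ordered by proper extension, the rank function and the length can be computed directly from these two formulas. The following Python sketch is an illustration under that finiteness assumption; sequences are tuples, and the names `rank` and `length` are chosen here for convenience.

```python
def rank(node, tree):
    """Rank of `node` in `tree`, where y is below x when y properly extends x.

    The rank is the supremum of rank(y) + 1 over all y below `node`,
    and 0 when nothing in `tree` properly extends `node`.
    """
    below = [t for t in tree if len(t) > len(node) and t[:len(node)] == node]
    return max((rank(t, tree) + 1 for t in below), default=0)


def length(tree):
    """Length of the ordering: supremum of rank(x) + 1 over all x in `tree`."""
    return max((rank(t, tree) + 1 for t in tree), default=0)


tree = {(), (0,), (1,), (0, 0), (0, 1), (0, 0, 5)}
print(rank((), tree), rank((0,), tree), length(tree))
```

In the infinite case the ranks are countable ordinals rather than natural numbers, and the recursion is justified by well-foundedness instead of finiteness.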


We are now ready to state and prove one of the most important 
theorems of descriptive set theory. 


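2.20. SOUSLIN THEOREM. $\mathbf{\Delta}^1_1$ = the class of all Borel sets.


PROOF. The open sets belong to $\mathbf{\Delta}^1_1$ by Theorem 2.15, since $\mathbf{\Sigma}^0_1 = \mathbf{\Sigma}^1_0 \subseteq \mathbf{\Delta}^1_1$.
We prove that $\mathbf{\Delta}^1_1$ is closed under countable unions and complements. The
case of complements is immediate.

Let $A = \bigcup_i A_i$ with each $A_i \in \mathbf{\Delta}^1_1$. Since each $A_i \in \mathbf{\Sigma}^1_1$,


$\alpha \in A_i \iff \exists\beta\ (\alpha, \beta) \in B_i$

with $B_i$ closed. Hence

$\alpha \in A \iff \exists i\, \exists\beta\ (\alpha, \beta) \in B_i \iff \exists\beta\ (\alpha, \beta') \in B_{\beta(0)}$


where $\beta'(n) = \beta(n+1)$. $\{(\alpha, \beta) : (\alpha, \beta') \in B_{\beta(0)}\}$ is closed. Since each
$A_i \in \mathbf{\Pi}^1_1$,

$\alpha \in A_i \iff \forall\beta\ (\alpha, \beta) \in B_i$

with each $B_i$ open. But then

$\alpha \in A \iff \exists i\, \forall\beta\ (\alpha, \beta) \in B_i \iff \forall\beta\, \exists i\ (\alpha, \beta_i) \in B_i$

where $\beta_i(n) = \beta(p_i^{\,n+1})$ with $p_i$ the (i + 1)-st prime number. Clearly the
second implies the third; and if $\forall i\, \exists\beta(i)\ (\alpha, \beta(i)) \notin B_i$, let $\beta_i = \beta(i)$ and
we have $\forall i\ (\alpha, \beta_i) \notin B_i$. Since


$\{(\alpha, \beta) : \exists i\ (\alpha, \beta_i) \in B_i\}$

is open, we have shown that $A \in \mathbf{\Sigma}^1_1 \cap \mathbf{\Pi}^1_1 = \mathbf{\Delta}^1_1$. □

Now consider any $A \in \mathbf{\Pi}^1_1$. Let R be as guaranteed by Theorem 2.16. By
Theorem 2.18,


$\alpha \in A \iff \prec$ is well-founded on $X_{\alpha,R}$.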
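2.21. LEMMA. Suppose $\exists\rho < \omega_1\ \forall\alpha \in A$ ($\prec\upharpoonright X_{\alpha,R}$ has length $\le \rho$). Then A
is Borel.


PROOF. For $\sigma \in X_{\alpha,R}$, let $|\sigma|_\alpha = |\sigma|^{\prec\upharpoonright X_{\alpha,R}}$. We show by induction on $\rho < \omega_1$
that, for each $\sigma \in \mathrm{Seq}$,


$\{\alpha : \sigma \notin X_{\alpha,R} \text{ or } |\sigma|_\alpha \le \rho\}$

is Borel. For $\rho = 0$,

$\{\alpha : \sigma \notin X_{\alpha,R} \text{ or } |\sigma|_\alpha = 0\} = \bigcap_{\tau \prec \sigma} \{\alpha : \tau \notin X_{\alpha,R}\}$,


which is closed, since $\{\alpha : \tau \in X_{\alpha,R}\}$ is clopen for each $\tau$.
If $\rho > 0$,


$\{\alpha : \sigma \notin X_{\alpha,R} \text{ or } |\sigma|_\alpha \le \rho\} = \bigcap_{\tau \prec \sigma} \bigcup_{\nu < \rho} \{\alpha : \tau \notin X_{\alpha,R} \text{ or } |\tau|_\alpha \le \nu\}$.


The lemma follows, since under its hypotheses,

$A = \{\alpha : \prec\upharpoonright X_{\alpha,R} \text{ is well-founded with length} \le \rho\}$
$\phantom{A} = \bigcap_{\sigma \in \mathrm{Seq}} \bigcup_{\nu < \rho} \{\alpha : \sigma \notin X_{\alpha,R} \text{ or } |\sigma|_\alpha \le \nu\}$. □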
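
Suppose now that $A \in \mathbf{\Delta}^1_1$. Let R and S satisfy

$\alpha \in A \iff \forall\beta\, \exists n\ R(\bar{\alpha}(n), \bar{\beta}(n))$;
$\alpha \in \bar{A} \iff \forall\gamma\, \exists n\ S(\bar{\alpha}(n), \bar{\gamma}(n))$.

Let Y be the set of all triples $(\bar{\alpha}(n), \bar{\beta}(n), \bar{\gamma}(n))$ such that

$\bar{\beta}(n) \in X_{\alpha,R} \ \&\ \bar{\gamma}(n) \in X_{\alpha,S}$.


(Note that this condition depends only on the finite sequence $\bar{\alpha}(n)$.) Say
that

$(\bar{\alpha}_1(n), \bar{\beta}_1(n), \bar{\gamma}_1(n)) \prec (\bar{\alpha}_2(m), \bar{\beta}_2(m), \bar{\gamma}_2(m))$
$\iff \bar{\alpha}_1(n) \prec \bar{\alpha}_2(m) \ \&\ \bar{\beta}_1(n) \prec \bar{\beta}_2(m) \ \&\ \bar{\gamma}_1(n) \prec \bar{\gamma}_2(m)$.


$\prec$ is well-founded on Y, since an infinite descending chain in Y would give
a triple $(\alpha, \beta, \gamma)$ such that


$\forall n\ \neg R(\bar{\alpha}(n), \bar{\beta}(n)) \ \&\ \forall n\ \neg S(\bar{\alpha}(n), \bar{\gamma}(n))$


and so $\alpha \notin A \cup \bar{A}$.
Let $\rho$ be the length of $\prec\upharpoonright Y$. Suppose $\alpha \in A$. Let $\gamma$ satisfy
$\forall n\ \neg S(\bar{\alpha}(n), \bar{\gamma}(n))$. Define


$\varphi(\bar{\beta}(n)) = (\bar{\alpha}(n), \bar{\beta}(n), \bar{\gamma}(n))$;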




$\varphi$ maps $X_{\alpha,R}$ into Y in a one-one order-preserving manner. Hence the
length of $\prec\upharpoonright X_{\alpha,R}$ is $\le \rho$, and so A is Borel by the Lemma. □


2.22. COROLLARY (to the first half of the proof, easily generalized). $\mathbf{\Sigma}^1_n$ and
$\mathbf{\Pi}^1_n$ are closed under countable unions and intersections for $n \ge 1$.


There are further relations between the projective hierarchy and the sets 
gotten by closing the open sets under Boolean algebraic operations. For 
these we need to consider uncountable unions and intersections. 


2.23. THEOREM (SIERPINSKI [1925]). Every $A \in \mathbf{\Sigma}^1_2$ is a union of $\aleph_1$ Borel
sets.


PROOF (outline). The proof of the Souslin Theorem
already shows that every $A \in \mathbf{\Pi}^1_1$ is a union of $\aleph_1$ Borel sets:


$A = \bigcup_{\rho < \omega_1} \{\alpha : \prec\upharpoonright X_{\alpha,R} \text{ is well-founded with length} \le \rho\}$.


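2.24. LEMMA. Every $A \in \mathbf{\Sigma}^1_1$ is a union of $\aleph_1$ Borel sets.


PROOF. Let $A = \{\alpha : \exists\beta\, \forall n\ R(\bar{\alpha}(n), \bar{\beta}(n))\}$. Let us order $\mathcal{N}$ by
$\beta_1 \lhd \beta_2 \iff \beta_1 \ne \beta_2$ and the least n with $\beta_1(n) \ne \beta_2(n)$ satisfies
$\beta_1(n) < \beta_2(n)$. It is easily checked that, for $\alpha \in A$, there is a least $\beta = \beta(\alpha)$ such
that $\forall n\ R(\bar{\alpha}(n), \bar{\beta}(n))$. For $\alpha \in A$ define


$Z_\alpha = \{\bar{\gamma}(n) : \gamma \lhd \beta(\alpha) \ \&\ \bar{\gamma}(n) \ne \overline{\beta(\alpha)}(n) \ \&\ \forall m < n\ R(\bar{\alpha}(m), \bar{\gamma}(m))\}$.

Clearly $\prec$ is well-founded on $Z_\alpha$. Now define, for $\rho < \omega_1$,

$A_\rho = \{\alpha : \alpha \in A \ \&\ \prec\upharpoonright Z_\alpha \text{ has length} < \rho\}$.


Clearly $A = \bigcup_{\rho < \omega_1} A_\rho$. An argument similar to that in the proof of the
Souslin Theorem shows that each $A_\rho$ is Borel. □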


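Now let $A \in \mathbf{\Sigma}^1_2$. By Theorem 2.16 there is a relation R such that


$\alpha \in A \iff \exists\beta\, \forall\gamma\, \exists n\ R(\bar{\alpha}(n), \bar{\beta}(n), \bar{\gamma}(n))$.

But then,

$\alpha \in A \iff \exists\beta\ (\prec\upharpoonright X_{\alpha,\beta,R} \text{ is well-founded})$,


where $X_{\alpha,\beta,R}$ has the obvious meaning. Then

$\alpha \in A \iff \exists\beta\, \exists\rho < \omega_1\ (\prec\upharpoonright X_{\alpha,\beta,R} \text{ is well-founded with length} \le \rho)$
$\phantom{\alpha \in A} \iff \exists\rho < \omega_1\, \exists\beta\ (\prec\upharpoonright X_{\alpha,\beta,R} \text{ is well-founded with length} \le \rho)$
$\phantom{\alpha \in A} \iff \exists\rho < \omega_1\, \exists\beta\, \exists f : X_{\alpha,\beta,R} \to \rho\ (f \text{ is order preserving})$.

By coding $\beta$ and f into a single function $\gamma : \omega \to \omega$, we see that

$\{\alpha : \exists\beta\, \exists f : X_{\alpha,\beta,R} \to \rho\ (f \text{ is order-preserving})\} \in \mathbf{\Sigma}^1_1$


for each $\rho$. Hence A is the union of $\aleph_1$ elements of $\mathbf{\Sigma}^1_1$. The Lemma implies
the Theorem. □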


It is known that one cannot prove or refute from the ZFC axioms the 
converse of Theorem 2.23: that every union of $\aleph_1$ Borel sets belongs to $\mathbf{\Sigma}^1_2$.


3. Structural and regularity properties of pointclasses 


The structural properties of $\mathbf{\Sigma}^1_1$ and $\mathbf{\Sigma}^1_2$ we have derived so far are also
properties of $\mathbf{\Pi}^1_1$ and $\mathbf{\Pi}^1_2$. (Theorem 2.23 is an exception.) But there are
other, deeper structural properties for which this is not the case. (There are
also some shallow properties for which it is not the case. For instance, $\mathbf{\Sigma}^0_\rho$
but not $\mathbf{\Pi}^0_\rho$ is closed under countable unions.) One of the fundamental
empirical principles of descriptive set theory is the following: 

There is a collection P of interesting structural properties such that, if $\Gamma$ is
an important non-self-dual pointclass, then either $\Gamma$ or $\breve{\Gamma}$ but never both has
most or all of the properties in P.

The proof that $\Gamma$ and $\breve{\Gamma}$ cannot both have a property in P is usually easy
and depends only on general principles. The difficulty lies in showing that 
one of the classes has the property in question. Sometimes this difficulty 
amounts to impossibility if one is limited to the ZFC axioms. 


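3.1. DEFINITION. Reduction($\Gamma$) means that, for all $A, B \in \Gamma$ with $A, B \subseteq \mathcal{N}^n$,
there are $A', B' \in \Gamma$ such that


$A' \subseteq A, \quad B' \subseteq B, \quad A' \cup B' = A \cup B$,

and $A' \cap B'$ is empty (see Fig. 1).


Fig. 1. [Diagram: two overlapping sets A and B, and disjoint subsets A' of A and B' of B with the same union.]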


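Reduction can never hold for both $\Gamma$ and $\breve{\Gamma}$ if $\Gamma$ has certain basic
properties.

3.2. THEOREM. If $\Gamma$ is a pointclass with a universal set and closed under
continuous preimages, then Reduction fails for $\Gamma$ or $\breve{\Gamma}$.


PROOF. Let U be universal for $\Gamma$ and let $\alpha \mapsto (\alpha_0, \alpha_1)$ be the canonical
homeomorphism of $\mathcal{N}$ onto $\mathcal{N}^2$. Let


$A = \{(\alpha, \beta) : (\alpha_0, \beta) \in U\}; \quad B = \{(\alpha, \beta) : (\alpha_1, \beta) \in U\}$.


If Reduction($\Gamma$), let A', B' be as given by Reduction($\Gamma$). If Reduction($\breve{\Gamma}$),
let A*, B* be as given by applying Reduction($\breve{\Gamma}$) to $\bar{A'}$, $\bar{B'}$. $A^* = \overline{B^*}$, so
$A^* \in \Gamma \cap \breve{\Gamma}$. If $\{\beta : (\alpha_0, \beta) \in U\}$ is the complement of $\{\beta : (\alpha_1, \beta) \in U\}$,
then $(\alpha, \beta) \in A^* \iff (\alpha, \beta) \notin A$ and $(\alpha, \beta) \in B^* \iff (\alpha, \beta) \notin B$. Hence every
$C \in \Gamma \cap \breve{\Gamma}$ is of the form $\{\beta : (\alpha, \beta) \in A^*\}$.

Now let


$\alpha \in C \iff (\alpha, \alpha) \notin A^*$.


Since $\Gamma$ is closed under continuous preimages, $C \in \Gamma \cap \breve{\Gamma}$. But then for
some $\alpha$, $C = \{\beta : (\alpha, \beta) \in A^*\}$. Thus


$(\alpha, \alpha) \in A^* \iff \alpha \in C \iff (\alpha, \alpha) \notin A^*$. □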
A stronger property than Reduction is the following: 
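3.3. DEFINITION. Prewellordering($\Gamma$) means that for every $A \in \Gamma$ there is a
function $\varphi : A \to \mathrm{Ordinals}$ and there are R, S in $\breve{\Gamma}$ such that, for $\beta \in A$,

$\alpha \in A \ \&\ \varphi(\alpha) < \varphi(\beta) \iff (\alpha, \beta) \in R$;
$\alpha \in A \ \&\ \varphi(\alpha) \le \varphi(\beta) \iff (\alpha, \beta) \in S$.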


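3.4. THEOREM. Prewellordering($\Gamma$) $\to$ Reduction($\Gamma$), provided that $\Gamma$ is
closed under finite unions, finite intersections, and continuous preimages.

PROOF. Let $A, B \in \Gamma$. Let

    $C_1 = \{(\alpha, \beta) : \alpha(0) = 0 \;\&\; \beta \in A\}$;  $C_2 = \{(\alpha, \beta) : \alpha(0) = 1 \;\&\; \beta \in B\}$.

Let $C = C_1 \cup C_2$. By the closure properties, $C \in \Gamma$. Let $\varphi : C \to$ Ordinals
be as given by Prewellordering($\Gamma$). Set

    $A' = A \cap \{\beta : (\mathbf{1}, \beta) \notin C \text{ or } \varphi((\mathbf{0}, \beta)) \le \varphi((\mathbf{1}, \beta))\}$;
    $B' = B \cap \{\beta : (\mathbf{0}, \beta) \notin C \text{ or } \varphi((\mathbf{1}, \beta)) < \varphi((\mathbf{0}, \beta))\}$,

where $\mathbf{0}$ and $\mathbf{1}$ are respectively the functions which are identically 0 and 1.
By the properties of $\varphi$ and the closure properties, $A'$ and $B'$ belong to $\Gamma$. It
is easy to check that $A'$ and $B'$ have the properties required by
Reduction. $\Box$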
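
The check left to the reader can be spelled out as a short case analysis. The following is only a sketch: it assumes $B'$ is defined symmetrically to $A'$ with a strict inequality (the displayed definitions are restated in the comments), and it writes $\mathbf{0}$, $\mathbf{1}$ for the constant functions.

```latex
% Assumed definitions (B' is the symmetric counterpart of A', with strict <):
%   A' = A \cap \{\beta : (\mathbf{1},\beta) \notin C \text{ or } \varphi(\mathbf{0},\beta) \le \varphi(\mathbf{1},\beta)\}
%   B' = B \cap \{\beta : (\mathbf{0},\beta) \notin C \text{ or } \varphi(\mathbf{1},\beta) < \varphi(\mathbf{0},\beta)\}
% Note: (\mathbf{0},\beta) \in C iff \beta \in A, and (\mathbf{1},\beta) \in C iff \beta \in B.
\begin{align*}
\beta \in A \setminus B &\;\Rightarrow\; (\mathbf{1},\beta) \notin C
   \;\Rightarrow\; \beta \in A' \text{ and } \beta \notin B',\\
\beta \in B \setminus A &\;\Rightarrow\; (\mathbf{0},\beta) \notin C
   \;\Rightarrow\; \beta \in B' \text{ and } \beta \notin A',\\
\beta \in A \cap B &\;\Rightarrow\; \text{exactly one of }
   \varphi(\mathbf{0},\beta) \le \varphi(\mathbf{1},\beta),\;
   \varphi(\mathbf{1},\beta) < \varphi(\mathbf{0},\beta) \text{ holds.}
\end{align*}
% Hence A' \subseteq A, B' \subseteq B, A' \cap B' = \emptyset, and A' \cup B' = A \cup B.
```

In each case $\beta$ lands in exactly one of $A'$, $B'$, which is all that Reduction asks for.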


Reduction is only one of a large collection of structural properties which 
follow from Prewellordering. 


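3.5. THEOREM. For all $\rho$, $1 \le \rho < \omega_1$, Prewellordering($\boldsymbol{\Sigma}^0_\rho$).

PROOF. Let $A \in \boldsymbol{\Sigma}^0_\rho$. Then $A = \bigcup_i A_i$ with each $A_i \in \boldsymbol{\Delta}^0_\rho$. We may assume
that $i < j$ implies $A_i \subseteq A_j$. Define, for each $\alpha \in A$,

    $\varphi(\alpha) =$ the least $i$ such that $\alpha \in A_i$.

If $\beta \in A$, then

    $\alpha \in A \;\&\; \varphi(\alpha) < \varphi(\beta) \leftrightarrow (\beta \notin A_0 \;\&\; \forall i\,(\beta \in A_{i+1} \to \alpha \in A_i))$;
    $\alpha \in A \;\&\; \varphi(\alpha) \le \varphi(\beta) \leftrightarrow \forall i\,(\beta \in A_i \to \alpha \in A_i)$.

The relations on the right belong to $\boldsymbol{\Pi}^0_\rho$. $\Box$

3.6. REMARK. For $\boldsymbol{\Sigma}^0_1$, the theorem fails for the reals.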
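3.7. THEOREM. Prewellordering($\boldsymbol{\Pi}^1_1$).

PROOF. Let $R$ witness that $A \in \boldsymbol{\Pi}^1_1$:

    $\alpha \in A \leftrightarrow {<}\restriction X_{\alpha,R}$ is well-founded.

Let $\varphi(\alpha)$ be the length of ${<}\restriction X_{\alpha,R}$. If $\beta \in A$, then

    $\alpha \in A \;\&\; \varphi(\alpha) < \varphi(\beta) \leftrightarrow \exists f\,[f : X_{\alpha,R} \to X_{\beta,R} \;\&\; (\sigma < \tau \to f(\sigma) < f(\tau))$
        & the empty sequence is not in the range of $f$],

    $\alpha \in A \;\&\; \varphi(\alpha) \le \varphi(\beta) \leftrightarrow \exists f\,[f : X_{\alpha,R} \to X_{\beta,R} \;\&\; (\sigma < \tau \to f(\sigma) < f(\tau))]$.

To prove this, let $\varphi(\alpha) \le \varphi(\beta)$, and define $f(\sigma)$ by induction on the length
of $\sigma$ so that the rank of $f(\sigma)$ in ${<}\restriction X_{\beta,R}$ is at least the rank of $\sigma$ in
${<}\restriction X_{\alpha,R}$. $f$ can be construed as a function from $\omega$ to $\omega$,
so both relations on the right belong to $\boldsymbol{\Sigma}^1_1$. $\Box$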


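3.8. THEOREM (Moschovakis). Prewellordering($\boldsymbol{\Pi}^1_n$) $\to$ Prewellordering($\boldsymbol{\Sigma}^1_{n+1}$).

PROOF. Let $A \in \boldsymbol{\Sigma}^1_{n+1}$. For some $B \in \boldsymbol{\Pi}^1_n$, $\alpha \in A \leftrightarrow (\exists \beta)\,(\alpha, \beta) \in B$. Let
$\varphi : B \to$ Ordinals be as demanded by Prewellordering($\boldsymbol{\Pi}^1_n$). For $\alpha \in A$ set

    $\psi(\alpha) = \inf\{\varphi(\alpha, \gamma) : (\alpha, \gamma) \in B\}$.

A routine computation shows that the required relations $R$ and $S$
exist. $\Box$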
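
One way to carry out the routine computation is sketched below. The names $\psi$ (for the infimum function on $A$) and $R_B$, $S_B$ are introduced here for illustration; $R_B$ and $S_B$ are assumed to be the strict and non-strict relations for $(B, \varphi)$ in the sense of Definition 3.3, lying in $\boldsymbol{\Sigma}^1_n$, the dual class of $\boldsymbol{\Pi}^1_n$.

```latex
% Hypothetical names: \psi is the infimum norm on A; R_B, S_B \in \boldsymbol{\Sigma}^1_n are the
% relations for (B, \varphi), valid whenever the second argument lies in B.
% For \beta \in A:
\begin{align*}
\alpha \in A \;\&\; \psi(\alpha) < \psi(\beta)
  &\leftrightarrow \forall \delta\,\bigl[(\beta,\delta) \notin B \;\lor\;
     \exists \gamma\,\bigl((\alpha,\gamma),(\beta,\delta)\bigr) \in R_B\bigr],\\
\alpha \in A \;\&\; \psi(\alpha) \le \psi(\beta)
  &\leftrightarrow \forall \delta\,\bigl[(\beta,\delta) \notin B \;\lor\;
     \exists \gamma\,\bigl((\alpha,\gamma),(\beta,\delta)\bigr) \in S_B\bigr].
\end{align*}
% Each bracket is \boldsymbol{\Sigma}^1_n, so each right-hand side is \boldsymbol{\Pi}^1_{n+1},
% the dual class of \boldsymbol{\Sigma}^1_{n+1}.
% Right-to-left uses a \delta with (\beta,\delta) \in B and \varphi(\beta,\delta) = \psi(\beta),
% which exists because \beta \in A and every non-empty set of ordinals has a least element.
```

Left-to-right uses a $\gamma$ at which $\varphi(\alpha, \gamma)$ attains $\psi(\alpha)$, together with $\psi(\beta) \le \varphi(\beta, \delta)$ for every $\delta$ with $(\beta, \delta) \in B$.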


3.9. COROLLARY. Prewellordering($\boldsymbol{\Sigma}^1_2$).


3.10. REMARK. The abstract notion of Prewellordering was probably first
isolated by Moschovakis, though in some sense Prewellordering($\boldsymbol{\Pi}^1_1$) is a
classical result. Reduction($\boldsymbol{\Pi}^1_1$) is a theorem of Kuratowski.


Some of the structural properties of $\boldsymbol{\Pi}^1_1$ and $\boldsymbol{\Sigma}^1_2$ are not implied by
Prewellordering. The best-known such property is Uniformization. We
postpone the subject of Uniformization until Section 4, where its significance
can be made clearer in terms of the effective theory.

We consider the following three regularity properties. 


3.11. DEFINITION. (i) $A$ has the Baire property if $A$ differs from an open set
by a set of the first category. A set is of the first category if it is disjoint from
a countable intersection of dense open sets.

(ii) $A$ has the perfect subset property if $A$ is countable or has a non-empty perfect
subset.

(iii) Definitions (i) and (ii) make sense in general spaces. For $A \subseteq$ Reals,
$A$ is measurable if $A$ is Lebesgue measurable. For $A \subseteq \mathcal{N}$, we say $A$ is
measurable if $A$ is measurable when construed as a set of irrationals. For
$A \subseteq 2^\omega$, $A$ is measurable if $A$ is measurable with respect to the product
measure obtained from the measure on $\{0, 1\}$ giving $\{0\}$ and $\{1\}$ each measure $\tfrac{1}{2}$.


There exist sets not satisfying any of (i), (ii) or (iii). However:

3.12. THEOREM. $A \in \boldsymbol{\Sigma}^1_1 \to A$ is measurable (Lusin [1917]), has the Baire
property (Lusin and Sierpinski [1923]), and has the perfect subset property
(Souslin).


3.13. COROLLARY. $A \in \boldsymbol{\Pi}^1_1 \to A$ is measurable and has the Baire property.


PROOF. These properties hold of $A$ if and only if they hold of the complement of $A$. $\Box$




3.14. COROLLARY. $A \in \boldsymbol{\Sigma}^1_1 \to A$ is countable or of power $2^{\aleph_0}$.


3.15. COROLLARY. $A \in \boldsymbol{\Sigma}^1_2 \to A$ is countable, has power $\aleph_1$, or has
power $2^{\aleph_0}$.


PROOF. Apply Theorem 2.23. $\Box$


4. Effective descriptive set theory 


The advent of effective descriptive set theory brought two main improve- 
ments to the classical theory. 

(1) The effective theory extends and refines the classical theory to give a 
genuine theory of definability, including a theory for reals as well as sets of 
reals. 

(2) The founders (principally Kleene) of the effective theory introduced 
superior notation (which we have been using) and a variety of results and 
techniques, including coding techniques, which greatly facilitate proofs 
even of assertions of the classical theory. 

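Recall that a subset $A$ of $\mathcal{N}^n$ belongs to $\boldsymbol{\Sigma}^1_k$ for $k$ odd just in case there is
a relation $R$ such that

    (*)  $(\alpha_1, \ldots, \alpha_n) \in A \leftrightarrow \exists \beta_1 \forall \beta_2 \cdots \exists \beta_k \forall m\, R(\bar{\alpha}_1(m), \ldots, \bar{\alpha}_n(m), \bar{\beta}_1(m), \ldots, \bar{\beta}_k(m))$.

For $k$ even there is a similar representation, with $\forall \beta_k \exists m$ replacing
$\exists \beta_k \forall m$.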


4.1. DEFINITION. $A \subseteq \mathcal{N}^n$ belongs to $\Sigma^1_k$ for $k$ odd just in case there is a
recursive relation $R$ such that (*) holds, where the notion of recursive
relations among finite sequences is defined in the natural way from that of
recursive relations on the natural numbers. The definition of $\Sigma^1_k$ for $k$ even
is similar.


4.2. DEFINITION. If $\alpha \in \mathcal{N}$, $A \subseteq \mathcal{N}^n$, $k$ odd, then $A \in \Sigma^1_k(\alpha)$ just in case
there is an $R$ recursive in $\alpha$ such that (*) holds; similarly for $k$ even.


These effective concepts subsume the classical concepts since

    $\boldsymbol{\Sigma}^1_k = \bigcup_\alpha \Sigma^1_k(\alpha)$.

4.3. DEFINITION. $\Pi^1_k$ is the dual class of $\Sigma^1_k$, and $\Delta^1_k = \Sigma^1_k \cap \Pi^1_k$. Similarly for $\Pi^1_k(\alpha)$ and $\Delta^1_k(\alpha)$.


The effective projective hierarchy makes sense for subsets of $\omega$ as
well as subsets of $\mathcal{N}$.


4.4. DEFINITION. Assume $k$ odd for definiteness. If $A \subseteq \omega^{n_1} \times \mathcal{N}^{n_2}$, then
$A \in \Sigma^1_k$ just in case there is a recursive relation $R$ such that

    $(a_1, \ldots, a_{n_1}, \alpha_1, \ldots, \alpha_{n_2}) \in A \leftrightarrow$
    $\exists \beta_1 \forall \beta_2 \cdots \exists \beta_k \forall m\, R(a_1, \ldots, a_{n_1}, \bar{\alpha}_1(m), \ldots, \bar{\alpha}_{n_2}(m), \bar{\beta}_1(m), \ldots, \bar{\beta}_k(m))$.

Other notions are defined similarly. The notion of a pointset is extended to
include subsets of $\omega^{n_1} \times \mathcal{N}^{n_2}$ and the notion of a pointclass is extended so
that a pointclass is a collection of pointsets closed under our canonical
homeomorphisms and also under canonical recursive bijections $\omega^n \to \omega^m$.

A refinement of the finite Borel hierarchy can also be defined for the
effective theory. It is easily seen that a set $A \subseteq \mathcal{N}^n$ belongs to $\boldsymbol{\Sigma}^0_k$, for odd $k$,
just in case there is a relation $R$ such that

    (**)  $\alpha \in A \leftrightarrow \exists a_1 \forall a_2 \cdots \exists a_k\, R(\bar{\alpha}(a_k), a_1, \ldots, a_{k-1})$,

where the $a_i$ range over natural numbers.


4.5. DEFINITION. $\Sigma^0_n$, $\Pi^0_n$, $\Delta^0_n$, $\Sigma^0_n(\alpha)$, $\Pi^0_n(\alpha)$, $\Delta^0_n(\alpha)$ are defined using (**) in
the same way the effective projective pointclasses were defined using (*).
Elements of $\Sigma^0_1$ are called semirecursive or recursively enumerable; elements
of $\Delta^0_1$ are called recursive.


With slight modifications, the effective notions can be defined for other
Polish spaces such as $2^\omega$ and the reals. As with the classical theory, the class
$\Sigma^0_1$ tends to behave differently for spaces other than $\mathcal{N}$, and some of the
results which follow are valid for that class only in $\mathcal{N}$.

The notion of Borel set also has an effective refinement, the notion of a
hyperarithmetic set. We omit the definition, and content ourselves with
remarking that $\omega_1$ is replaced by Church-Kleene $\omega_1$, the least ordinal not
the order type of a recursive wellordering of a subset of $\omega$.


4.6. DEFINITION. If $\alpha \in \mathcal{N}$ and $\Gamma$ is a pointclass, $\alpha \in \Gamma$ just in case

    $\{(a_1, a_2) : \alpha(a_1) = a_2\} \in \Gamma$.

In other words, a function belongs to $\Gamma$ if and only if its graph belongs to $\Gamma$.

All the theorems of classical descriptive set theory have effective
refinements. We state the following results for the $\Sigma^0_n$ and $\Sigma^1_n$, but the
relativizations to $\Sigma^0_n(\alpha)$ and $\Sigma^1_n(\alpha)$ also hold and imply the corresponding
classical results. The proofs of the classical theorems are sufficiently
effective to yield proofs of the effective theorems. Many of the theorems
which follow are due to Kleene. (See Kleene [1952] and [1955].)

We first describe the effective analogue of a continuous function. In
Section 1 we gave a base for the non-empty open subsets of $\mathcal{N}$. We give $\omega$
the discrete topology and take as a base for its non-empty open subsets the
collection of all singletons. A basic open subset of $\omega^{m_1} \times \mathcal{N}^{m_2}$ is a product of
basic open subsets of the factors. Let $B_0, B_1, \ldots$ be an effective enumeration
of the basic open subsets of $\omega^{m_1} \times \mathcal{N}^{m_2}$ (i.e. an effective enumeration of
the finite sequences of natural numbers which determine the basic open
sets).


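4.7. DEFINITION. $F : \omega^{n_1} \times \mathcal{N}^{n_2} \to \omega^{m_1} \times \mathcal{N}^{m_2}$ is recursive if

    $\{(k, a_1, \ldots, a_{n_1}, \alpha_1, \ldots, \alpha_{n_2}) : F(a_1, \ldots, a_{n_1}, \alpha_1, \ldots, \alpha_{n_2}) \in B_k\} \in \Delta^0_1$.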


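4.8. THEOREM. If $\Gamma = \Sigma^0_n$, $\Pi^0_n$, $\Delta^0_n$, $\Sigma^1_n$, $\Pi^1_n$, or $\Delta^1_n$, then $\Gamma$ is closed under finite
unions, finite intersections and recursive preimages.


4.9. THEOREM. Let $\Gamma$ be $\Sigma^0_n$, $\Pi^0_n$, $\Sigma^1_n$, or $\Pi^1_n$, and let $\boldsymbol{\Gamma}$ be the corresponding
$\boldsymbol{\Sigma}^0_n$, $\boldsymbol{\Pi}^0_n$, $\boldsymbol{\Sigma}^1_n$, or $\boldsymbol{\Pi}^1_n$.

There is a $U \in \Gamma$ which is universal for $\boldsymbol{\Gamma}$; there is a $U \subseteq \omega \times \mathcal{N}$ with
$U \in \Gamma$ such that the sets of the form

    $\{\beta : (b, \beta) \in U\}$

are exactly the subsets of $\mathcal{N}$ which belong to $\Gamma$.


PROOF. For the proof of the second half of the theorem, one uses the
enumeration theorem of recursion theory to handle the case $\Gamma = \Sigma^0_1$. $\Box$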


4.10. THEOREM. $\Delta^0_n \subsetneq \Sigma^0_n \subsetneq \Delta^0_{n+1}$; $\Delta^1_n \subsetneq \Sigma^1_n \subsetneq \Delta^1_{n+1}$.


The extension of the first part of Theorem 4.10 to transfinite levels,
which we have not defined, also holds.
The effective analogue of Theorem 2.20 is due to Kleene and asserts

    The class of hyperarithmetic sets $= \Delta^1_1$.

4.11. THEOREM. Prewellordering($\Sigma^0_n$). Prewellordering($\Pi^1_1$).
Prewellordering($\Pi^1_n$) $\to$ Prewellordering($\Sigma^1_{n+1}$).


4.12. COROLLARY. Prewellordering($\Sigma^1_2$).


The effective concepts are defined in terms of recursive function theory, 
but there are other, equivalent definitions. The language of arithmetic is 
the language in first order logic with function symbols for addition, 
multiplication and successor and a symbol for 0. 


4.13. DEFINITION. A set $A$ of natural numbers is arithmetical if there is a
formula $\varphi(x)$ of the language of arithmetic with only the variable $x$ free
such that

    $n \in A \leftrightarrow N \models \varphi[n]$,

where $N$ is the standard model of arithmetic.

4.14. THEOREM. $A$ is arithmetical $\leftrightarrow A \in \bigcup_n \Sigma^0_n$.


Hence $\bigcup_n \Sigma^0_n$ consists just of those sets definable in arithmetic. The
definition of arithmetical can be extended to subsets of $\omega^m \times \mathcal{N}^n$ and
Theorem 4.14 remains true.


The language of analysis is obtained from the language of arithmetic by
introducing variables to range over functions from $\omega$ to $\omega$ and quantifying
also over such variables.


4.15. DEFINITION. $A \subseteq \omega^m \times \mathcal{N}^n$ is analytical if there is a formula $\varphi$ of
analysis such that

    $(a_1, \ldots, a_m, \alpha_1, \ldots, \alpha_n) \in A \leftrightarrow M \models \varphi[a_1, \ldots, a_m, \alpha_1, \ldots, \alpha_n]$,

where $M$ is the standard model of analysis.

4.16. THEOREM. $A$ is analytical $\leftrightarrow A \in \bigcup_n \Sigma^1_n$.


The theorem asserts that $\bigcup_n \Sigma^1_n$ consists of just those sets definable in
analysis.

The effective theory allows one to raise a new kind of question: if a 
simply definable subset of N has a member, does it have a simply definable 
member? To make this question more precise, we give a definition. 

4.17. DEFINITION. $\Gamma_1$ is a basis for $\Gamma_2$ if for every non-empty $A \in \Gamma_2$,
$A \subseteq \mathcal{N}$, there is an element of $A \cap \Gamma_1$.


Here are a positive basis result and a negative basis result. 
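4.18. THEOREM (Kleene). $\Delta^0_1$ is a basis for $\Sigma^0_1$. $\Delta^1_1$ is not a basis for $\Pi^0_1$.


PROOF. The first part of the theorem follows from the fact that every non-empty
$A \in \Sigma^0_1$ has an eventually constant member. For the second part let $A \subseteq \mathcal{N}$,
$A = \mathcal{N} - \Delta^1_1$. It can be shown that $A \in \Sigma^1_1$. Let $\alpha \in A \leftrightarrow \exists \beta\,((\alpha, \beta) \in B)$
with $B \in \Pi^0_1$. If $(\alpha, \beta) \in B \cap \Delta^1_1$, then $\alpha \in \Delta^1_1$, a contradiction. $\Box$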


Often basis theorems can be improved by strengthening them to 
uniformization theorems. 


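4.19. DEFINITION. Uniformization($\Gamma_1$, $\Gamma_2$) holds if for every $A \subseteq \mathcal{N}^2$,
$A \in \Gamma_1$, there is a $B \subseteq A$, $B \in \Gamma_2$, such that

    $\forall \alpha\,[\exists \beta\,(\alpha, \beta) \in A \to \exists \beta\,(\alpha, \beta) \in B]$

and, for every $\alpha$, there is at most one $\beta$ with $(\alpha, \beta) \in B$. ($B$ is a cross-section
or a choice function for $A$; see Fig. 2.) Uniformization($\Gamma$) $\leftrightarrow$
Uniformization($\Gamma$, $\Gamma$).


[Fig. 2. A cross-section $B$ of $A$.]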


In terms of the effective theory, Uniformization($\Gamma$) not only produces a
basis for $\Gamma$, but also does so in a uniform way. But Uniformization makes
sense for arbitrary pointclasses and is a classical notion, unlike the notion 
of basis. 


4.20. THEOREM (Kondo-Addison-Novikov). Uniformization($\Pi^1_1$). For all $\alpha$,
Uniformization($\Pi^1_1(\alpha)$).

4.21. COROLLARY. Uniformization($\Sigma^1_2$).

4.22. COROLLARY. Uniformization($\boldsymbol{\Pi}^1_1$); Uniformization($\boldsymbol{\Sigma}^1_2$).

4.23. COROLLARY. $\Delta^1_2$ is a basis for $\Sigma^1_2$.


Corollary 4.22 is due to Kondo [1938], improving a result of Novikov
(Lusin and Novikov [1935]). The effective version (i.e., the theorem) was
proved by Addison, who "effectivized" the earlier proofs.

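We sketch the proof of the theorem. Let

    $A = \{(\alpha, \beta) : \forall \gamma\, \exists n\, R(\bar{\alpha}(n), \bar{\beta}(n), \bar{\gamma}(n))\}$.

Now let $X_{\alpha,\beta,R}$ be defined as before and let $|\sigma|_{\alpha,\beta}$ be the rank of $\sigma$ in
${<}\restriction X_{\alpha,\beta,R}$. Let $\phi : \omega \to$ Seq be recursive. Let

    $A_0 = A$,
    $A_{2n+1} = A_{2n} \cap \{(\alpha, \beta) : \forall \beta'\,((\alpha, \beta') \in A_{2n} \to \beta(n) \le \beta'(n))\}$,
    $A_{2n+2} = A_{2n+1} \cap \{(\alpha, \beta) : \forall \beta'\,((\alpha, \beta') \in A_{2n+1} \to |\phi(n)|_{\alpha,\beta} \le |\phi(n)|_{\alpha,\beta'}$
        or $\phi(n) \notin X_{\alpha,\beta,R})\}$,
    $B = \bigcap_{n \in \omega} A_n$.

Using ideas from the proof of Prewellordering($\Pi^1_1$), one can show that
$B \in \Pi^1_1$. The definition of $A_{2n+1}$ guarantees that for each $\alpha$ there is at most
one $\beta$ with $(\alpha, \beta) \in B$. If $\exists \beta\,((\alpha, \beta) \in A)$, then the definition of the $A_n$
produces a unique candidate $\beta$ for $(\alpha, \beta) \in B$. Clearly for each $n$ there is a
$\beta_n$ with $(\alpha, \beta_n) \in A_{2n+2}$. If $\phi(n) \in X_{\alpha,\beta,R}$, set $\psi(\phi(n)) = |\phi(n)|_{\alpha,\beta_n}$. $\psi$ is
order preserving and so verifies $(\alpha, \beta) \in B$.

The second half of the Theorem follows from the first half, using a $\Pi^1_1$ set
universal for $\boldsymbol{\Pi}^1_1$. Corollary 4.22 follows from the second half of the
Theorem and the relativization of Corollary 4.21.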

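To prove Corollary 4.21, let

    $(\alpha, \beta) \in A \leftrightarrow \exists \gamma\,((\alpha, \beta, \gamma) \in C)$

with $C \in \Pi^1_1$. By the Theorem, let $D \in \Pi^1_1$ give a cross-section for $C$,
construing $(\beta, \gamma)$ as an element of $\mathcal{N}$. Define $B$ by

    $(\alpha, \beta) \in B \leftrightarrow \exists \gamma\,((\alpha, \beta, \gamma) \in D)$.


For Corollary 4.23, let $A \in \Sigma^1_2$ be non-empty. Let $B = \{(\mathbf{0}, \alpha) : \alpha \in A\}$, where $\mathbf{0}(n) = 0$
for all $n$. Let $C \in \Sigma^1_2$ be as given by Corollary 4.21. Then the unique $\alpha$ such
that $(\mathbf{0}, \alpha) \in C$ is given by

    $\alpha(n) = m \leftrightarrow \exists \beta\,((\mathbf{0}, \beta) \in C \;\&\; \beta(n) = m)$
        $\leftrightarrow \forall \beta\,((\mathbf{0}, \beta) \in C \to \beta(n) = m)$.


The first condition shows $\alpha \in \Sigma^1_2$ and the second shows $\alpha \in \Pi^1_2$.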


5. The axiom of constructibility and large cardinal axioms 


Many basic questions about projective sets cannot be settled on the basis
of the ZFC axioms. It is consistent (if ZFC + there exists an inaccessible
cardinal is consistent) that every projective set is measurable, has the Baire
property, and has the perfect subset property (Solovay [1970]). On the
other hand, Gödel [1938] points out that, if the axiom of constructibility
holds, then there is a set in $\Delta^1_2$ which is not measurable and does not have
the Baire property, and there is a set in $\Pi^1_1$ which does not have the perfect
subset property. Whether the basic structural properties of $\Pi^1_1$ and
$\Sigma^1_2$, such as Prewellordering and Uniformization, hold for $\Pi^1_n$, for $\Sigma^1_n$, or for neither
when $n \ge 3$ is similarly undecidable in ZFC. (This assertion should be
weakened slightly. The consistency of, say, Prewellordering($\Pi^1_n$) is still
unproved except for odd $n$, and there only from strong hypotheses.)
Problems about relations between higher projective classes and Boolean
algebraic operations appear similarly undecidable.

Addison [1959b] shows that most of the basic questions about projective
sets can be answered by assuming the axiom of constructibility $V = L$.

If $V = L$, there is a wellordering $<$ of $\mathcal{N}$ of order type $\omega_1$ which belongs
to $\Delta^1_2$. This wellordering has a further technical property called "goodness".
This makes it possible to prove the following theorems:


5.1. THEOREM (Addison). For all $n \ge 2$, $V = L \to$ Prewellordering($\Sigma^1_n$) and
so Prewellordering($\boldsymbol{\Sigma}^1_n$).


PROOF (outline). Let $A = \{\alpha : \exists \beta\,(\alpha, \beta) \in B\}$ with $B \in \Pi^1_n$, $n \ge 1$. For each
$\beta \in \mathcal{N}$, set $|\beta| =$ the order type of the initial segment of $\beta$ with respect to
$<$. For $\alpha \in A$ set

    $\varphi(\alpha) = \inf\{|\beta| : (\alpha, \beta) \in B\}$.

Using the "goodness" of $<$, one can define the desired $R$ and $S$. $\Box$


5.2. THEOREM (Addison). $V = L \to$ for every $n \ge 2$, Uniformization($\Sigma^1_n$)
and so Uniformization($\boldsymbol{\Sigma}^1_n$).

The proof proceeds by choosing least elements in the sense of $<$.

$V = L$ thus gives a fairly complete theory for projective sets. There are
nevertheless two unpleasant aspects of this theory. As we have remarked,
the regularity questions are all answered negatively. Furthermore, there is
something odd about the sequence

    $\Sigma^1_0 \quad \Pi^1_1 \quad \Sigma^1_2 \quad \Sigma^1_3 \quad \Sigma^1_4 \;\cdots$.

Solovay has shown that, if one assumes large cardinal axioms in place of
$V = L$, some regularity questions about higher projective classes can be
settled positively.


5.3. DEFINITION. A cardinal $\kappa$ is measurable if there is a function $\mu$
defined on the power set of $\kappa$ ($\kappa$ is construed as a von Neumann initial
ordinal) such that

    $\mu(A) = 0$ or $1$ for all $A$;
    $\mu(\{\alpha\}) = 0$ for all $\alpha < \kappa$;
    $\mu(\kappa) = 1$;
    $\mu(\kappa - A) = 1 - \mu(A)$;
    $\mu(\bigcup_{i \in I} A_i) = 0$ if each $\mu(A_i) = 0$ and $I$ has power $< \kappa$.


5.4. DEFINITION. Let MC be the assertion that an uncountable measurable
cardinal exists.


It is known that uncountable measurable cardinals, if they exist, must be
very large. For example, if $\kappa$ is an uncountable measurable cardinal, $\kappa$ is
the $\kappa$-th inaccessible cardinal. It is also known that MC does not follow
from the ZFC axioms. Nevertheless, there are arguments, however
inconclusive, to support the acceptance of MC and other large cardinal
axioms.


5.5. THEOREM (Solovay). MC $\to$ every element of $\boldsymbol{\Sigma}^1_2$ is measurable, has the
Baire property, and has the perfect subset property (Solovay [1969]).


Theorem 5.5 and the further results which follow are surprising, since an
assumption about very large cardinal numbers is used to settle questions
concerned only with $\mathcal{N}$ or the reals.

Some further progress can be made on structural questions if one
assumes MC.


5.6. THEOREM (Mansfield [1971] improving a result of Martin and
Solovay [1969]). MC $\to$ Uniformization($\Pi^1_2$, $\Pi^1_3$).


5.7. COROLLARY (Martin and Solovay [1969]). MC $\to \Delta^1_4$ is a basis for $\Sigma^1_3$.


Without much work one gets from the proof of the Corollary:


5.8. THEOREM (Martin [1977]). MC $\to$ every element of $\boldsymbol{\Sigma}^1_3$ is a union of $\aleph_2$
Borel sets.


5.9. COROLLARY. MC $\to$ every element of $\boldsymbol{\Sigma}^1_3$ has power $\le \aleph_2$ or power $2^{\aleph_0}$.


Unhappily, MC does no more work. It is consistent with MC (Silver) that
there is a set in $\Delta^1_3$ which is not measurable and does not have the Baire
property and that there is a set in $\Pi^1_2$ which does not have the perfect subset
property. Furthermore MC seemingly does not allow us to settle whether
Uniformization, or Prewellordering, or neither holds for $\Sigma^1_3$ or $\Pi^1_3$. These
results, mostly in Silver [1971], are obtained by considering $L[\mu]$, the class
of sets constructible from a $\mu$ which witnesses that some $\kappa$ is measurable.
Silver shows that in $L[\mu]$ there is a good wellordering of $\mathcal{N}$ which belongs
to $\Delta^1_3$. He uses this wellordering just as Gödel and Addison use the
wellordering $<$ defined earlier.


6. Projective determinacy 


There is little direct relation between structural properties and those
regularity properties we have discussed. In Gale and Stewart [1953] a
new regularity property was introduced: the property of determinacy.
Results in Mycielski and Swierczkowski [1964] and Davis [1964], along
with an old result of Banach, indicated that determinacy is stronger than
the usual regularity properties in that it implies them all. Later, even more
surprising results showed that determinacy implies even structural properties.
In recent years it has become the custom to assume the determinacy of
projective sets as an hypothesis and to deduce a theory of projective sets
from this hypothesis.

6.1. DEFINITION. Let $A \subseteq \mathcal{N}$. Following Gale and Stewart [1953], associate
with $A$ a 2-person infinite game of perfect information $G_A$. Player I
begins the game by choosing $n_0 \in \omega$; then Player II chooses $n_1 \in \omega$; then I
chooses $n_2 \in \omega$; and so on. Let $\alpha(i) = n_i$. I wins $G_A$ just in case $\alpha \in A$. The
notions of a strategy and of a winning strategy for I or II for $G_A$ are defined
in the obvious way. We say that $G_A$ is determined if one of the players has a
winning strategy. By Determinacy($\Gamma$) we mean the assertion that $G_A$ is
determined for every $A \in \Gamma$. Projective determinacy (PD) is the assertion
Determinacy($\bigcup_n \boldsymbol{\Sigma}^1_n$).
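
For readers who want the "obvious" definitions written out, one standard formulation is sketched here. The symbols $s$, $t$ and the notation $\alpha \restriction k$ for the finite sequence $\langle \alpha(0), \ldots, \alpha(k-1) \rangle$ are choices made for this illustration and are not taken from the text.

```latex
% Sketch only. A strategy for I is a function s from finite sequences of even length to \omega;
% a strategy for II is a function t from finite sequences of odd length to \omega.
\begin{align*}
\alpha \text{ is a play according to } s &\leftrightarrow \forall k\; \alpha(2k) = s(\alpha \restriction 2k),\\
\alpha \text{ is a play according to } t &\leftrightarrow \forall k\; \alpha(2k+1) = t(\alpha \restriction (2k+1)),\\
s \text{ is winning for I in } G_A &\leftrightarrow \forall \alpha\,(\alpha \text{ is a play according to } s \to \alpha \in A),\\
t \text{ is winning for II in } G_A &\leftrightarrow \forall \alpha\,(\alpha \text{ is a play according to } t \to \alpha \notin A).
\end{align*}
```

With these definitions both players cannot have winning strategies for the same $G_A$, since playing $s$ against $t$ would yield a single $\alpha$ lying both in and out of $A$.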


6.2. THEOREM (Kechris, Martin improving results of Banach, Mycielski and
Swierczkowski, and Morton Davis). Determinacy($\boldsymbol{\Sigma}^1_n$) $\to$ every set in $\boldsymbol{\Sigma}^1_{n+1}$
is measurable, has the Baire property, and has the perfect subset property.


Just as with other regularity properties, one can prove determinacy for 
sufficiently simple classes, and MC makes possible a proof of determinacy 
for wider classes. 


6.3. THEOREM (Martin [1975, 1970]). Determinacy($\boldsymbol{\Delta}^1_1$). If MC, then
Determinacy($\boldsymbol{\Pi}^1_1$) ($\leftrightarrow$ Determinacy($\boldsymbol{\Sigma}^1_1$)).


The proof of Determinacy($\boldsymbol{\Delta}^1_1$) is quite different from the proofs of the
other regularity properties of Borel sets. In other cases, the proofs can be
carried out in the usual formal theory of $\omega$ and its power set. Friedman
[1971] shows that Determinacy($\boldsymbol{\Delta}^1_1$) cannot be proved in this theory and
indeed cannot be proved in Zermelo set theory (set theory without the
Axiom of Replacement). Hence, although Determinacy($\boldsymbol{\Delta}^1_1$) is a statement
about $\omega$ and $\mathcal{N}$ only, its proof involves in an essential way transfinite
iterations of the power set operation.

Determinacy($\boldsymbol{\Pi}^1_1$) cannot be proved from the ZFC axioms, and even MC
is not sufficient to prove Determinacy($\boldsymbol{\Delta}^1_2$).

Blackwell [1967] showed that alternate proofs of some of the structural
properties of $\boldsymbol{\Pi}^1_1$ could be given using the Gale-Stewart theorem
Determinacy($\boldsymbol{\Sigma}^0_1$) (Gale and Stewart [1953]). Addison and the author
independently conceived of assuming PD and applying the method of
Blackwell to higher projective classes. Moschovakis (Addison and
Moschovakis [1968]) and Martin [1968] independently proved the following
theorem:


6.4. THEOREM. If PD, then Prewellordering($\boldsymbol{\Sigma}^1_n$) $\to$ Prewellordering($\boldsymbol{\Pi}^1_{n+1}$).

6.5. COROLLARY. If PD, then Prewellordering($\boldsymbol{\Pi}^1_n$) for all odd $n$ and
Prewellordering($\boldsymbol{\Sigma}^1_n$) for all even $n$.


6.6. COROLLARY. If PD, then Prewellordering($\Pi^1_n$) for all odd $n$ and
Prewellordering($\Sigma^1_n$) for all even $n$.


Corollary 6.5 follows from Theorems 6.4 and 3.8. 


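PROOF OF THEOREM 6.4. We give the proof from Addison and Moschovakis
[1968]. Let $A = \{\alpha : \forall \gamma\,(\alpha, \gamma) \in B\}$ with $B \in \boldsymbol{\Sigma}^1_n$. Let
$\varphi : B \to$ Ordinals be given by Prewellordering($\boldsymbol{\Sigma}^1_n$). For each $\beta \in A$ and
$\alpha \in \mathcal{N}$ consider the following game $G(\alpha, \beta)$: I and II choose $n_0, n_1, \ldots$ by
moving alternately. Let $\gamma(i) = n_{2i}$ and $\delta(i) = n_{2i+1}$. I wins if

    $\alpha \notin A$ or $\varphi((\alpha, \gamma)) > \varphi((\beta, \delta))$.

We define a relation $\alpha \le \beta$ by

    $\alpha \le \beta \leftrightarrow$ II has a winning strategy for $G(\alpha, \beta)$.

Since $G(\alpha, \beta) = G_C$ for projective $C$, PD implies that $G(\alpha, \beta)$ is determined.
This justifies the following, for $\beta \in A$:

    $\alpha \le \beta \leftrightarrow \exists s\, \forall \gamma\,(s$ is a strategy for II &
        $(\alpha, \gamma) \in B \;\&\; \varphi((\alpha, \gamma)) \le \varphi((\beta, s(\gamma))))$,
    $\beta \not\le \alpha \leftrightarrow \exists s\, \forall \gamma\,(s$ is a strategy for I &
        $(\alpha, \gamma) \in B \;\&\; \varphi((\alpha, \gamma)) < \varphi((\beta, s(\gamma))))$.

The expressions on the right define relations in $\boldsymbol{\Sigma}^1_{n+1}$. Also, $\alpha \le \alpha$, since II
can simply choose $\delta = \gamma$.

If $\alpha \not\le \beta$, then I has a winning strategy $s$ for $G(\alpha, \beta)$. Hence $\alpha \notin A$ or II
has a winning strategy $s'$ for $G(\beta, \alpha)$ defined by $s'(\langle n_0, \ldots, n_k \rangle) =
s(\langle n_0, \ldots, n_{k-1} \rangle)$. This shows that, for $\alpha$ and $\beta \in A$,

    $\alpha \not\le \beta \to \beta \le \alpha$.

Suppose $\alpha \le \beta \le \gamma$. Let $s_1$ and $s_2$ be strategies for II witnessing $\alpha \le \beta$
and $\beta \le \gamma$ respectively. Let $s_3 = s_2 \circ s_1$. Then $s_3$ is a strategy for II witnessing
$\alpha \le \gamma$.

Say $\alpha > \beta \leftrightarrow \alpha \not\le \beta$. Suppose $\alpha_0 > \alpha_1 > \alpha_2 > \cdots$. Let $s_0, s_1, \ldots$ be winning
strategies for I for the games $G(\alpha_0, \alpha_1)$, $G(\alpha_1, \alpha_2), \ldots$ respectively. We
produce simultaneously plays of all the $G(\alpha_i, \alpha_{i+1})$ by letting I's moves in
$G(\alpha_i, \alpha_{i+1})$ be as dictated by $s_i$ and II's moves in $G(\alpha_i, \alpha_{i+1})$ be I's moves in
$G(\alpha_{i+1}, \alpha_{i+2})$. The resulting play yields a sequence $\gamma_0, \gamma_1, \ldots$ such that each
$(\alpha_i, \gamma_i) \in B$ and

    $\varphi((\alpha_0, \gamma_0)) > \varphi((\alpha_1, \gamma_1)) > \cdots$.

This is a contradiction.

We have shown that $\le$ is reflexive, connected and transitive and the
associated $<$ is well-founded. Hence there is a $\psi : A \to$ Ordinals such that

    $\alpha \le \beta \leftrightarrow \psi(\alpha) \le \psi(\beta)$. $\Box$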


The problem of proving Uniformization($\Pi^1_n$) for all odd $n$ from PD was
open for some time. Then Moschovakis formulated and proved a stronger 
result. 


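6.7. DEFINITION. Scale($\Gamma$) means that, for every $A \in \Gamma$, $A \subseteq \mathcal{N}$, there is a
sequence $\varphi_i$, $R_i$, $S_i$, $i = 0, 1, 2, \ldots$, such that the following hold.

(i) $\{(i, \alpha, \beta) : (\alpha, \beta) \in R_i\} \in \breve{\Gamma}$; $\{(i, \alpha, \beta) : (\alpha, \beta) \in S_i\} \in \breve{\Gamma}$.

(ii) Each $A$, $\varphi_i$, $R_i$, $S_i$ has the properties demanded of $A$, $\varphi$, $R$, $S$ in the
definition of Prewellordering (Definition 3.3).

(iii) Suppose $\alpha_n \in A$, $n = 0, 1, 2, \ldots$, $\alpha = \lim_n \alpha_n$, and for every $i$ the
sequence $\varphi_i(\alpha_0), \varphi_i(\alpha_1), \ldots$ is eventually constant. Then

    (a) $\alpha \in A$;

    (b) $\forall i\,(\varphi_i(\alpha) \le \lim_n \varphi_i(\alpha_n))$.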

6.8. THEOREM (Moschovakis [1971]). Scale($\Pi^1_n$) $\to$ Scale($\Sigma^1_{n+1}$). If PD,
then Scale($\Sigma^1_n$) $\to$ Scale($\Pi^1_{n+1}$).


The proof of the more difficult second part is by an extension of the
methods of the proof of Theorem 6.4.


6.9. COROLLARY. Assume PD. Scale($\Pi^1_n$) and Scale($\boldsymbol{\Pi}^1_n$) for $n$ odd;
Scale($\Sigma^1_n$) and Scale($\boldsymbol{\Sigma}^1_n$) for $n$ even.


Corollary 6.9 follows from Theorem 6.8 and the easily proved Scale($\Sigma^1_0$).

6.10. COROLLARY. Assume PD. Uniformization($\Pi^1_n$) and Uniformization($\boldsymbol{\Pi}^1_n$)
for $n$ odd; Uniformization($\Sigma^1_n$) and Uniformization($\boldsymbol{\Sigma}^1_n$) for $n$ even,
$n > 0$.


PROOF. The proof of Corollary 6.10 is exactly like that of Theorem 4.20,
with $\varphi_n(\alpha, \beta)$ playing the role played there by $|\phi(n)|_{\alpha,\beta}$. $\Box$

6.11. COROLLARY. If PD, then $\Delta^1_n$ is a basis for $\Sigma^1_n$ for all even $n$.


From Theorems 6.4 and 6.8, and a further result of Moschovakis [1973]
proved by a similar method, researchers have deduced an almost complete
structural theory for the projective hierarchy. Projective determinacy is
then at least equal to $V = L$ in strength with respect to questions about
projective sets, and it is more pleasing in that (1) regularity questions are
answered positively and (2) the sequence

    $\Sigma^1_0 \quad \Pi^1_1 \quad \Sigma^1_2 \quad \Pi^1_3 \;\cdots$

seems more plausible than that derived from $V = L$.
PD also sheds light on the question of the relation between projection
and Boolean algebraic operations.


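6.12. DEFINITION. $\boldsymbol{\delta}^1_n$ is the least ordinal not the length of a well-founded
relation on $\mathcal{N}$ belonging to $\boldsymbol{\Delta}^1_n$.


6.13. DEFINITION. If $\kappa$ is a cardinal number, $A \subseteq \mathcal{N}$ is $\kappa$-Souslin if there is
a relation $R$ such that

    $\alpha \in A \leftrightarrow \exists f : \omega \to \kappa\; \forall n\, R(\bar{\alpha}(n), \bar{f}(n))$.


6.14. THEOREM. $A$ is $\aleph_0$-Souslin $\leftrightarrow A \in \boldsymbol{\Sigma}^1_1$. $A \in \boldsymbol{\Sigma}^1_2 \to A$ is $\aleph_1$-Souslin
(Shoenfield [1961]). If MC, then $A \in \boldsymbol{\Sigma}^1_3 \to A$ is $\aleph_2$-Souslin (Martin, based
on Martin and Solovay [1969]). If PD, then $A \in \boldsymbol{\Sigma}^1_{2n+2} \to A$ is $\kappa$-Souslin
where $\kappa$ is the cardinal of $\boldsymbol{\delta}^1_{2n+1}$ (Moschovakis). If PD, then $A \in \boldsymbol{\Sigma}^1_{2n+1} \to A$
is $\kappa$-Souslin for some $\kappa < \boldsymbol{\delta}^1_{2n+1}$ (Moschovakis).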


For $\boldsymbol{\Sigma}^1_1$, the result is by Theorem 2.16. The proof of Theorem 2.23 gives a
representation of sets in $\boldsymbol{\Sigma}^1_2$ as $\aleph_1$-Souslin. The last two sentences of the
Theorem follow from Theorem 6.8.


6.15. THEOREM (Kunen, Martin [1977]). Every $\kappa$-Souslin well-founded
relation has length of power $\le \kappa$, if $\kappa$ is infinite.


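6.16. COROLLARY. $\boldsymbol{\delta}^1_2 \le \omega_2$ (Martin [1977]). MC $\to \boldsymbol{\delta}^1_3 \le \omega_3$. PD $\to \boldsymbol{\delta}^1_4 \le \omega_4$.

6.17. COROLLARY. If PD, then $A \in \boldsymbol{\Sigma}^1_4 \to A$ is a union of $\aleph_3$ Borel sets.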
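
The first clause of Corollary 6.16 can be read off from Theorems 6.14 and 6.15 roughly as follows. This is a sketch only: the symbol $\prec$ is introduced here for an arbitrary well-founded relation, and relations on $\mathcal{N}$ are assumed to be treated as pointsets via the canonical homeomorphisms, so that the $\kappa$-Souslin notion applies to them.

```latex
% Sketch; "length" is used as in Definition 6.12.
\begin{align*}
\prec \text{ well-founded},\ \prec\, \in \boldsymbol{\Delta}^1_2
  &\;\Rightarrow\; \prec\, \in \boldsymbol{\Sigma}^1_2
   \;\Rightarrow\; \prec \text{ is } \aleph_1\text{-Souslin} && \text{(Theorem 6.14)}\\
  &\;\Rightarrow\; \text{the length of } \prec \text{ has power} \le \aleph_1 && \text{(Theorem 6.15)}\\
  &\;\Rightarrow\; \text{the length of } \prec \text{ is} < \omega_2 .
\end{align*}
% Every such length is below \omega_2, so the least ordinal that is not such a length
% satisfies \boldsymbol{\delta}^1_2 \le \omega_2.
```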


Some of the most interesting open questions regarding determinacy and
projective sets concern the $\boldsymbol{\delta}^1_n$. What are upper bounds for $\boldsymbol{\delta}^1_n$, $n \ge 5$?
Kunen has some partial results for $n = 5$. Are the $\boldsymbol{\delta}^1_n$ cardinals? The latter
question is not a purely technical one, since the continuum hypothesis
implies that, for $n \ge 2$, $\omega_1 < \boldsymbol{\delta}^1_n < \omega_2$.

As a final application of projective determinacy, we mention the Wadge 
ordering. 


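6.18. DEFINITION. If $A, B \subseteq \mathcal{N}$, $A \le_W B$ just in case there is a continuous
$f : \mathcal{N} \to \mathcal{N}$ such that $f(A) \subseteq B$ and $f(\mathcal{N} - A) \subseteq \mathcal{N} - B$.


6.19. THEOREM (W. Wadge). Assume PD. If $A$ and $B$ are projective, then
$A \le_W B$ or $B \le_W \mathcal{N} - A$.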


PROOF. Consider the game $G$ defined as follows. Let $\alpha$ be I's play and let $\beta$
be II's play. II wins just in case

    $\alpha \in A \leftrightarrow \beta \in B$.

A winning strategy for II witnesses $A \le_W B$. A winning strategy for I
witnesses $B \le_W \mathcal{N} - A$. $\Box$
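
The step from a winning strategy to a continuous reduction can be sketched as follows. The names $t$, $s$, $f_t$, $g_s$ are introduced here for illustration only; the sketch assumes the usual convention that I moves first and the players alternate.

```latex
% Sketch. If t is a winning strategy for II, let f_t(\alpha) be II's play when I plays \alpha
% and II follows t. Each value f_t(\alpha)(k) depends only on \alpha(0), \ldots, \alpha(k),
% so f_t is continuous, and since II wins every such play:
\[
  \alpha \in A \leftrightarrow f_t(\alpha) \in B \quad \text{for all } \alpha,
  \qquad \text{so } f_t \text{ witnesses } A \le_W B .
\]
% If s is a winning strategy for I, let g_s(\beta) be I's play when II plays \beta and I follows s.
% Each value g_s(\beta)(k) depends only on \beta(0), \ldots, \beta(k-1), so g_s is continuous,
% and since I wins every such play:
\[
  \beta \in B \leftrightarrow g_s(\beta) \notin A \quad \text{for all } \beta,
  \qquad \text{so } g_s \text{ witnesses } B \le_W \mathcal{N} - A .
\]
```

PD enters only to guarantee that one of the two players has a winning strategy, since the winning condition is projective when $A$ and $B$ are.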


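6.20. DEFINITION. If $A, B \subseteq \mathcal{N}$, $A \sim_W B$ if ($A \le_W B$ and $B \le_W A$) or
($A \le_W \mathcal{N} - B$ and $\mathcal{N} - B \le_W A$). The Wadge degree of $A$ is $[A] = \{B : A \sim_W B\}$.
$[A] < [B] \leftrightarrow A \le_W B$ and $A \not\sim_W B$.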


6.21. THEOREM (Martin). If PD, then the Wadge degrees of projective sets 
are well-ordered by <. 


Remark. Theorem 6.19 already gives that the Wadge degrees are linearly 
ordered. 


The Wadge degrees are thus an ultimately fine hierarchy on the
projective sets, assuming PD. If more determinacy than PD is assumed,
this Wadge hierarchy extends correspondingly further. For Borel sets, no
determinacy assumption is required, by Theorem 6.3.

Call a Wadge degree $[A]$ self-dual if $A \le_W \mathcal{N} - A$. Steel has shown from, say,
PD that Reduction must hold on one side or the other of any, say, projective
non-self-dual Wadge degree closed under countable unions and intersections.
This helps to explain the empirically observed pervasiveness of the
Reduction phenomenon.

We now discuss briefly some of the consequences of a demonstrably false 
proposition. 

6.22. DEFINITION. The axiom of determinacy (AD) is the assertion that $G_A$
is determined for every $A$.


AD was proposed by Mycielski and Steinhaus. See Mycielski [1964] for
more information on AD. AD contradicts the axiom of choice and so is not
a genuine candidate for an axiom of set theory. Nevertheless consequences
of AD are worth noting because AD may hold, for example, in the realm of
sets ordinal definable from reals.

AD implies that all sets have all important regularity properties. For
example, AD implies that every set is measurable (Mycielski and
Swierczkowski [1964]), has the Baire property, and has the perfect subset
property (Davis [1964]).

AD implies that the collection of all Wadge degrees is well-ordered, and 
thus the entire power set of the continuum is arranged in a hierarchy of 
increasing complexity. 

In the area of projective sets, AD has some very appealing consequences,
consequences mostly inconsistent with the axiom of choice. AD
implies that the $\boldsymbol{\delta}^1_n$ are all cardinals (Moschovakis [1970]), in fact, measurable
cardinals (Kunen, Martin, Solovay).


6.23. DEFINITION. If $\kappa$ is a cardinal, $B_\kappa$ is the smallest Boolean algebra
containing the open sets and closed under well-ordered unions and
intersections of length $< \kappa$.


6.24. THEOREM (Martin [1977], Moschovakis [1971]). AD $\to B_{\boldsymbol{\delta}^1_n} = \boldsymbol{\Delta}^1_n$ for
all odd $n$.


For $n = 1$, the conclusion is just the Souslin Theorem. $\boldsymbol{\Delta}^1_n \subseteq B_{\boldsymbol{\delta}^1_n}$ for odd
$n$ is a consequence of PD and so apparently is not inconsistent with the
axiom of choice. Even $\boldsymbol{\Delta}^1_3 \supseteq B_{\boldsymbol{\delta}^1_3}$ is inconsistent with MC plus the axiom of
choice. Thus AD gives a more elegant descriptive set theory than is
possible with the axiom of choice. This is even more striking when one
considers $\boldsymbol{\Sigma}^2_n$, defined by projection on the power set of $\mathcal{N}$. AD gives a very
pleasant theory of these classes, and no significant structural theory
consistent with the axiom of choice is known.

Is PD true? It is certainly not self-evident. Some set theorists consider 
large cardinal axioms self-evident, or at least as following from a priori 
principles implied by the concept of set. Weak forms of PD (Determinacy($\boldsymbol{\Pi}^1_1$))
follow from large cardinal axioms. It is possible that PD itself
follows from large cardinal axioms, but this remains unproved.

The author regards PD as an hypothesis with a status similar to that of a
theoretical hypothesis in physics. Three kinds of quasi-empirical evidence 
for PD have been produced. 

(1) The mere failure to refute such a powerful assertion is some evidence 
for its truth. (2) Special cases of PD have been verified: Determinacy($\boldsymbol{\Delta}^1_1$).
(3) The consequences of PD in the realm of descriptive set theory are so 
plausible and coherent that they lend plausibility to the principle which 
implies them. 
The mathematician constructs proofs. The proof-theorist considers
proofs themselves as mathematical objects and studies them with the tools 
of modern mathematics. 

Proof theory began with Hilbert’s Program. Hilbert’s idea was to exploit 
the concrete, finitistic nature of proofs to provide a simple foundation for 
mathematics. Gödel’s Incompleteness Theorems dealt the program a
staggering blow. These theorems are among the most basic in logic and are 
discussed in Smorynski’s introductory chapter. 

It was much harder to make a reasonable selection of topics from proof 
theory than from the other parts of logic because the subject is going in 
many different directions. We attempted to select topics which illustrate 
the parts of proof theory which are well developed and are relevant to
other parts of logic and mathematics. Reading the chapters that resulted
one sees certain threads which run through them and seem to be central 
concerns of present-day proof theory. We mention two. 

One of the most important threads holding proof theory together is the 
question: What more do we know about a true statement when we learn 
that it is provable in a particular theory, or has a proof in some normal 
form? Another is the search for relationships between mathematical and 
proof-theoretic principles. Both of these aspects of proof theory are 
beautifully illustrated in Schwichtenberg’s chapter. He shows the reader 
how Gentzen’s method of cut-elimination can be used for a wide variety of 
such studies. 

The first thread mentioned above is central to Statman’s chapter. He 
presents a case study of what more one can learn from a direct proof of a 
theorem in a formal equation calculus than one learns from an indirect 
proof. 

Feferman’s chapter looks at some formal theories of higher-type 
mathematical objects, functions on natural numbers, sets of such functions, 
etc. These theories must, by Gödel’s Theorem, be incomplete but empirical
evidence shows that they embody natural mathematical principles and are 
strong enough to carry out large portions of mathematics. Feferman studies 
the above questions as well as the relative strength of the various theories 
by means of independence results. 

Troelstra’s chapter discusses constructive notions of proof. Consider, for
example, a proof of $\varphi \vee \psi$ (“$\varphi$ or $\psi$”). A classical mathematician accepts a
proof of $\varphi \vee \psi$ even when the proof does not tell him which of the two is
actually true. (Take $\varphi \vee \neg\varphi$ for example.) The constructivist demands
more. For him a proof of $\varphi \vee \psi$ must contain either a proof of $\varphi$ or a proof
of $\psi$ and he must be able to tell which. Beyond such simplicities, however,
lie highly problematic matters. The study of constructive proofs gives rise 
to several competing, and mutually incompatible, views of constructive 
mathematics, intuitionism being one of the best known. 

Fourman’s chapter discusses the relationship between intuitionism and 
the notion of topos from category theory. The Kock-Reyes chapter 
(Chapter A.8) gave the category-theorist’s view of parts of logic. Here we 
get the logician’s view of parts of category theory. The Barendregt chapter
discusses recent progress in the old search for a coherent theory of 
self-applicable functions. 

The Handbook comes to a dramatic close with the Paris-Harrington
chapter which discusses a true statement of finite combinatorics which is
not provable in Peano arithmetic.

1. Hilbert’s Program


Mathematics, at the turn of the century, was plagued by various 
difficulties ranging from antinomies and paradoxes to inconsistencies both 
formal and personal. There had been difficulties earlier in mathematics, but 
these had been removed or detoured: The Greeks eventually shrugged 
their shoulders and admitted irrational numbers; the analysts avoided 
paradoxes involving infinitesimals by finally isolating and rigorizing the 
concepts of limit and continuity. Even in set theory, the solution to the 
problem of the paradoxes had been offered as early as 1908 by Zermelo: 
One must first know what one is talking about before one can axiomatize a 
subject. Thus, instead of taking as axioms for set theory some intuitively 
obvious properties of finite sets, some obvious properties of the set of all 
subsets of a given set, and yet some other obvious properties of a third 
entity — a process that almost guarantees contradictions — Zermelo first 
described the cumulative hierarchy and then listed axioms for this single 
entity. Until recent work on large cardinals, axioms later added were 
merely further properties obviously true for this hierarchy but which were 
formally underivable. 

Sociologists would describe what transpired next in terms of ‘‘culture 
lag’. Despite the fact that a consistent set theory was available, mathemati- 
cians continued to worry about consistency. Some even doubted the 
consistency of arithmetic itself! To make matters worse, L.E.J. Brouwer
was making the rounds in a bizarre attempt to turn mathematics into a 
religion. 

When, in 1920, Hermann Weyl fell prey to Brouwer’s lunacy, David 
Hilbert decided to intervene. He observed that (REID [1970], p. 155) “What
Weyl and Brouwer do comes to the same thing as to follow in the footsteps 
of Kronecker! They seek to save mathematics by throwing overboard all 
that which is troublesome... . They would chop up and mangle the science. 
If we would follow such a reform as the one they suggest, we would run the 
risk of losing a great part of our most valuable treasures!” 

The vehemence with which Hilbert made the above declaration is most 
readily understood when one remembers that Hilbert made his name by 
the use of non-constructive techniques. His solution to Gordan’s problem
in the theory of invariants (REID [1970], Chapter V) had elicited the charge
of “theology” from Gordan. Kronecker refused to believe that the
theorem, which asserted the existence of objects satisfying some condition, 
had been proven as the objects had not been explicitly constructed. 
Lindemann called the technique “unheimlich”. Thus, it is not surprising
that Hilbert continued (REID [1970], p. 157) “I believe that as little as
Kronecker was able to abolish the irrational numbers ... just as little will 
Weyl and Brouwer today be able to succeed. Brouwer is not, as Weyl 
believes him to be, the Revolution — only the repetition of an attempted 
Putsch”.

Even if Hilbert had faith in Zermelo’s set theory, he could not use it: 
For, he had not to secure mathematics but to stop a Putsch. So Hilbert 
proposed his Conservation Program: To justify the use of abstract tech- 
niques, he would show — by as simple and concrete a means as possible — 
that the use of abstract techniques was conservative — i.e. that any 
concrete assertion one could derive by means of such abstract techniques 
would be derivable without them. 

To clarify these matters, we introduce some Hilbertian jargon whose 
exact meaning was never delineated by Hilbert. First, in the domain of 
concrete mathematics, there are finitistically meaningful statements and 
finitistic means of proof. The finitistically meaningful statements are called 
real statements and are (say) identities of the form 


$$\forall x\,(fx = gx),$$


where f, g are reasonably simple functions (e.g. primitive recursive). 
Finitistic proofs correspond roughly to computations or combinatorial 
manipulations. More complicated statements are merely ideal ones and, as 
such, have no meaning; but they can be manipulated abstractly — just as i 
is not a real number, but can be dealt with algebraically, freely using the 
fact that $i^2 = -1$. Hilbert’s contention was that, just as the use of i leads to
no new algebraic identities, the use of ideal statements and abstract
reasoning about them would not allow one to derive any new real 
statements — i.e. none which were not already derivable finitistically. To 
refute Weyl and Brouwer, Hilbert required that this latter conservation 
property itself be finitistically provable. 

To avail itself of a finitistic treatment, the ideal statements and abstract 
reasoning would have to be codified in some formal system. Then the 
abstract reasoning would be codified by simple combinatorial manipula- 
tions and similar simple combinatorial manipulations could be used to 
demonstrate this conservation. 

At this point, one could try to analyze either the reasons for Hilbert’s 
belief that this could be done or the assumptions that necessarily underlie
such a Program. The author does not find these topics particularly 
interesting and so we skip them. 

824 SMORYNSKI/THE INCOMPLETENESS THEOREMS [CH. D.1, §1

The question probably on the reader’s mind is: This is all very nice, but
where does consistency come in? For, as everyone knows, this chapter
is supposed to be about consistency. Hilbert’s Consistency Program is a 
natural outgrowth of and successor to Hilbert’s Conservation Program. 
There are two reasons for this: 

(i) Consistency is merely the assertion that some string of symbols is not 
derivable. Since derivations are simple combinatorial manipulations, this is 
a finitistically meaningful statement and ought to have a finitistic proof. 

(ii) Proving consistency of the formal system encoding the abstract 
concepts already establishes the conservation result! 

Reason (i) is straightforward and we do not discuss it. Reason (ii) is
particularly important and we should comment on it. Let R, I denote
formal systems encoding real statements with their finitistic proofs and
ideal statements with their abstract reasoning, respectively. Let $\varphi$ be a real
statement $\forall x\,(fx = gx)$. Now, if $I \vdash \varphi$, then there is a derivation, d, of $\varphi$
from I. But, derivations are concrete objects and, for some real formula
P(x, y) encoding derivations in I,

$$R \vdash P(d, \ulcorner\varphi\urcorner),$$

where $\ulcorner\varphi\urcorner$ is some code for $\varphi$. Now, if $\varphi$ were false, one would have $fa \neq ga$
for some a and hence,

$$R \vdash P(c, \ulcorner\neg\varphi\urcorner)$$

for some c. In fact, one would have the stronger assertion

$$R \vdash fx \neq gx \rightarrow P(c_x, \ulcorner\neg\varphi\urcorner)$$

for some $c_x$ depending on x. But, if R proves consistency of I, we see

$$R \vdash \neg\,(P(d, \ulcorner\varphi\urcorner) \wedge P(c_x, \ulcorner\neg\varphi\urcorner)),$$

whence $R \vdash fx = gx$, with free variable x, i.e. $R \vdash \forall x\,(fx = gx)$.

[The above argument is a bit vague and is rife with additional assump- 
tions. To make it rigorous, we would have to get down to the basics of 
encoding — which is more than we intend to do in this section. The 
assumptions on P are brought out in Sections 2 and 3. A formal version of 
the above argument appears in Section 4.] 

The argument of the above paragraph clearly invited Hilbert to establish 
his Consistency Program: To devise a finitistic means of proving the 
consistency of various formal systems encoding abstract reasoning with 
ideal statements. 

Since the Consistency Program was as broad as the general Conservation 
Program and, since it looked more tractable, Hilbert fixed on it, asserting 
(MESCHKOWSKI [1973], p. 56):

If the arbitrarily given axioms do not contradict each other
through their consequences, then they are true, then the 
objects defined through the axioms exist. That, for me, is 
the criterion of truth and existence. 


In summary, Hilbert’s Consistency Program had as its goal the proof, by 
finitistic means, of the consistency of strong systems. The solution would 
completely justify the use of abstract concepts. The proof would success- 
fully repudiate Brouwer and bring Weyl back into the fold. 

It’s a shame that it couldn’t work. 


2. Gödel’s theorems


In 1930, while in his twenties, Kurt Gödel made a major announcement:
Hilbert’s Consistency Program could not be carried out. For, he had 
proven two theorems which were then considered moderately devastating 
and which still induce nightmares among the infirm. Loosely stated, these 
theorems are: 


FIRST INCOMPLETENESS THEOREM. Let T be a formal theory containing
arithmetic. Then there is a sentence $\varphi$ which asserts its own unprovability and
is such that:

(i) If T is consistent, $T \nvdash \varphi$.

(ii) If T is $\omega$-consistent, $T \nvdash \neg\varphi$.


SECOND INCOMPLETENESS THEOREM. Let T be a consistent formal theory
containing arithmetic. Then

$$T \nvdash \mathrm{Con}_T,$$

where $\mathrm{Con}_T$ is the sentence asserting the consistency of T.


The Second Theorem clearly destroys the Consistency Program. For, if
R cannot prove its own consistency, how can it hope to prove the
consistency of I? (R and I are as in Section 1.) Even the First Theorem does
this since (1) the statement $\varphi$ is real; and (2) $\varphi$ is easily seen to be true. ((1)
requires looking at the construction of $\varphi$; (2) is seen by observing that $\varphi$
asserts its unprovability and is indeed unprovable.) Thus, the First
Theorem shows that the Conservation Program cannot be carried out and,
hence, that the same must hold for the Consistency Program.

Let us consider the proofs of these remarkable theorems.

2.1. Preliminaries


The clause in each theorem that T contain arithmetic is just a means of 
avoiding the problem of stating explicitly what conditions must be met. 
These conditions are encodability conditions and, as Gödel showed, one
can do a great deal of encoding on natural numbers. We defer until Section 
3 the discussion of how the encoding is handled and discuss here what is to 
be encoded and where it is to be encoded. 

Throughout this chapter, T will be some fixed, but unspecified, consis- 
tent formal theory. For later convenience, we assume that the encoding is 
done in some fixed formal theory S and that T contains S. We do not
specify S; it is usually taken to be a formal system of arithmetic, although
a weak set theory is often more convenient. The sense in which S is
contained in T is better exemplified than explained: If S is a formal system 
of arithmetic and T is, say, ZF, then T contains S in the sense that there is a 
well-known embedding, or interpretation, of S in T. It is this sort of 
embedding that we have in mind. 

Since encoding is to take place in S, it will have to have a large supply of 
constants and closed terms to be used as codes. (E.g. in formal arithmetic,
one has $\bar 0, \bar 1, \ldots$ .) S will also have certain function symbols to be described
shortly. 

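To each formula, $\varphi$, of the language of T is assigned a closed term, $\ulcorner\varphi\urcorner$,
called the code of $\varphi$. [N.B. If $\varphi x$ is a formula with free variable x, then $\ulcorner\varphi x\urcorner$
is a closed term encoding the formula $\varphi x$, with x viewed as a syntactic object
and not as a parameter.] Corresponding to the logical connectives and
quantifiers are function symbols, neg, imp, etc., such that, for all formulae
$\varphi$, $\psi$: $S \vdash \mathrm{neg}(\ulcorner\varphi\urcorner) = \ulcorner\neg\varphi\urcorner$, $S \vdash \mathrm{imp}(\ulcorner\varphi\urcorner, \ulcorner\psi\urcorner) = \ulcorner\varphi \rightarrow \psi\urcorner$, etc.

Of particular importance is the substitution operator, represented by the
function symbol sub. For formulae $\varphi x$, terms t with codes $\ulcorner t\urcorner$,

$$S \vdash \mathrm{sub}(\ulcorner\varphi x\urcorner, \ulcorner t\urcorner) = \ulcorner\varphi t\urcorner.$$

Iteration of sub allows one to define $\mathrm{sub}_2, \mathrm{sub}_3, \ldots$, such that

$$S \vdash \mathrm{sub}_n(\ulcorner\varphi x_1 \cdots x_n\urcorner, \ulcorner t_1\urcorner, \ldots, \ulcorner t_n\urcorner) = \ulcorner\varphi t_1 \cdots t_n\urcorner.$$

Finally, we also encode derivations and have a binary relation
$\mathrm{Prov}_T(x, y)$ (read “x proves y” or “x is a proof of y”) such that for closed
$t_1, t_2$: $S \vdash \mathrm{Prov}_T(t_1, t_2)$ iff $t_1$ is the code of a derivation in T of the formula
with code $t_2$. It follows that $T \vdash \varphi$ iff $S \vdash \mathrm{Prov}_T(t, \ulcorner\varphi\urcorner)$ for some closed term t.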

If one defines

$$\mathrm{Pr}_T(y) \leftrightarrow \exists x\, \mathrm{Prov}_T(x, y),$$

then one obtains a predicate asserting provability. However, it is not
always the case that

(*) $T \vdash \varphi$ iff $S \vdash \mathrm{Pr}_T(\ulcorner\varphi\urcorner)$,

unless S is fairly sound (a term to be defined later). The reason is that the
existential quantifier in $\mathrm{Pr}_T$ makes it essentially an ideal statement: While a
consistent theory cannot prove false real statements, $\forall x\,(fx = gx)$, it can
prove false existential ones, $\exists x\,(fx = gx)$. Thus (*) can fail.

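The above encoding can be carried out, however, in such a way that the
following important conditions are met for all sentences $\varphi$, $\psi$:

D1 $T \vdash \varphi$ implies $S \vdash \mathrm{Pr}_T(\ulcorner\varphi\urcorner)$.
D2 $S \vdash \mathrm{Pr}_T(\ulcorner\varphi\urcorner) \rightarrow \mathrm{Pr}_T(\ulcorner\mathrm{Pr}_T(\ulcorner\varphi\urcorner)\urcorner)$.
D3 $S \vdash \mathrm{Pr}_T(\ulcorner\varphi\urcorner) \wedge \mathrm{Pr}_T(\ulcorner\varphi \rightarrow \psi\urcorner) \rightarrow \mathrm{Pr}_T(\ulcorner\psi\urcorner)$.

Conditions D1-D3 are called the Derivability Conditions.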


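2.2. Proof of the Incompleteness Theorems

The Incompleteness Theorems depend on the following.


2.2.1. THEOREM (Diagonalization Lemma). Let $\varphi x$ in the language of T
have only the free variable indicated. Then there is a sentence $\psi$ such that

$$S \vdash \psi \leftrightarrow \varphi(\ulcorner\psi\urcorner).$$

[N.B. If $\varphi$ or $\psi$ is not in the language of S, then by “$S \vdash \cdots$”, we mean
that the equivalence is proven in the theory S’ in the language of T whose
only non-logical axioms are those of S. S’ is conservative over S.]


PROOF. Given $\varphi x$, let $\theta x \leftrightarrow \varphi(\mathrm{sub}(x, x))$ be the diagonalization of $\varphi$. Let
$m = \ulcorner\theta x\urcorner$ and $\psi = \theta m$. Then we claim

$$S \vdash \psi \leftrightarrow \varphi(\ulcorner\psi\urcorner).$$

For, in S, we see that

$$\theta m \leftrightarrow \varphi(\mathrm{sub}(m, m))$$
$$\leftrightarrow \varphi(\mathrm{sub}(\ulcorner\theta x\urcorner, m)) \quad (\text{since } m = \ulcorner\theta x\urcorner)$$
$$\leftrightarrow \varphi(\ulcorner\theta m\urcorner) \leftrightarrow \varphi(\ulcorner\psi\urcorner).$$ □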
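
The diagonal construction can be imitated with character strings. The following is only a toy sketch under stated assumptions, not the arithmetical encoding of the text: "formulas" are Python strings whose single free variable is written X, the code of a formula is its string literal as produced by repr, and the names code, sub and diagonalize are hypothetical. Unlike the loose sub(x, x) of the proof, the quotation step is made explicit as sub(X, code(X)).

```python
# Toy model of the Diagonalization Lemma using strings.
# Assumptions of this sketch (not from the text): a formula is a string whose
# free variable is the capital letter X, and the "code" of a formula is its
# Python string literal.

def code(formula: str) -> str:
    """Stand-in for the corner quotes: a closed term naming the formula."""
    return repr(formula)

def sub(formula: str, term: str) -> str:
    """Replace every occurrence of the free variable X by the given term."""
    return formula.replace("X", term)

def diagonalize(phi: str) -> str:
    """Return psi, which reads phi(t) for a closed term t that evaluates to psi."""
    theta = sub(phi, "sub(X, code(X))")  # theta(X) := phi(sub(X, X))
    m = code(theta)                      # m := code of theta
    return sub(theta, m)                 # psi := theta(m)

phi = "unprovable(X)"
psi = diagonalize(phi)
# The argument of 'unprovable' inside psi is a term whose value is psi itself.
inner_term = psi[len("unprovable("):-1]
assert eval(inner_term) == psi
```

The final assertion is the string analogue of $S \vdash \psi \leftrightarrow \varphi(\ulcorner\psi\urcorner)$: the term occurring inside psi denotes psi.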


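We apply 2.2.1 to $\neg\mathrm{Pr}_T(x)$.


2.2.2. THEOREM (First Incompleteness Theorem). Let $T \vdash \varphi \leftrightarrow \neg\mathrm{Pr}_T(\ulcorner\varphi\urcorner)$.
Then:

(i) $T \nvdash \varphi$;

(ii) under an additional assumption, $T \nvdash \neg\varphi$.


PROOF. (i) Observe $T \vdash \varphi$ implies $T \vdash \mathrm{Pr}_T(\ulcorner\varphi\urcorner)$, by D1, which implies
$T \vdash \neg\varphi$, contradicting the consistency of T.

(ii) The additional assumption is a strengthening of the converse to D1,
namely $T \vdash \mathrm{Pr}_T(\ulcorner\varphi\urcorner)$ implies $T \vdash \varphi$.

Suppose $T \vdash \neg\varphi$. Then $T \vdash \neg\neg\mathrm{Pr}_T(\ulcorner\varphi\urcorner)$, so that $T \vdash \mathrm{Pr}_T(\ulcorner\varphi\urcorner)$ and, by the
additional assumption, $T \vdash \varphi$, again contradicting the consistency of T. □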


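2.2.3. THEOREM (Second Incompleteness Theorem). Let $\mathrm{Con}_T$ be
$\neg\mathrm{Pr}_T(\ulcorner\Lambda\urcorner)$, where $\Lambda$ is any convenient contradictory statement. Then

$$T \nvdash \mathrm{Con}_T.$$


PROOF. Let $\varphi$ be as in the statement of Theorem 2.2.2. We show:
$S \vdash \varphi \leftrightarrow \mathrm{Con}_T$.

Observe that $S \vdash \varphi \rightarrow \neg\mathrm{Pr}_T(\ulcorner\varphi\urcorner)$ implies $S \vdash \varphi \rightarrow \neg\mathrm{Pr}_T(\ulcorner\Lambda\urcorner)$, since
$T \vdash \Lambda \rightarrow \varphi$ implies $S \vdash \mathrm{Pr}_T(\ulcorner\Lambda \rightarrow \varphi\urcorner)$, by D1, which implies
$S \vdash \mathrm{Pr}_T(\ulcorner\Lambda\urcorner) \rightarrow \mathrm{Pr}_T(\ulcorner\varphi\urcorner)$, by D3.

But $\varphi \rightarrow \neg\mathrm{Pr}_T(\ulcorner\Lambda\urcorner)$ is just $\varphi \rightarrow \mathrm{Con}_T$ and we have proven half of the
equivalence.

Conversely, by D2, $S \vdash \mathrm{Pr}_T(\ulcorner\varphi\urcorner) \rightarrow \mathrm{Pr}_T(\ulcorner\mathrm{Pr}_T(\ulcorner\varphi\urcorner)\urcorner)$, which implies
$S \vdash \mathrm{Pr}_T(\ulcorner\varphi\urcorner) \rightarrow \mathrm{Pr}_T(\ulcorner\neg\varphi\urcorner)$, by D1, D3, since $\varphi \leftrightarrow \neg\mathrm{Pr}_T(\ulcorner\varphi\urcorner)$. This yields
$S \vdash \mathrm{Pr}_T(\ulcorner\varphi\urcorner) \rightarrow \mathrm{Pr}_T(\ulcorner\varphi \wedge \neg\varphi\urcorner)$, by D1, D3, and logic, which implies
$S \vdash \mathrm{Pr}_T(\ulcorner\varphi\urcorner) \rightarrow \mathrm{Pr}_T(\ulcorner\Lambda\urcorner)$, by D1, D3, and logic. By contraposition,
$S \vdash \neg\mathrm{Pr}_T(\ulcorner\Lambda\urcorner) \rightarrow \neg\mathrm{Pr}_T(\ulcorner\varphi\urcorner)$, which is $S \vdash \mathrm{Con}_T \rightarrow \varphi$, by definitions.
Since $T \nvdash \varphi$ by Theorem 2.2.2(i) and T contains S, it follows that $T \nvdash \mathrm{Con}_T$. □


2.2.4. COROLLARY. $S \vdash \mathrm{Con}_T \rightarrow \mathrm{Con}_{T + \neg\mathrm{Con}_T}$.


PROOF. By the proof of Theorem 2.2.3,

(i) $S \vdash \mathrm{Con}_T \rightarrow \neg\mathrm{Pr}_T(\ulcorner\varphi\urcorner)$,

(ii) $S \vdash \mathrm{Con}_T \leftrightarrow \varphi$.

Using D1, D3, it follows that $S \vdash \mathrm{Con}_T \rightarrow \neg\mathrm{Pr}_T(\ulcorner\mathrm{Con}_T\urcorner)$, so that

$$S \vdash \mathrm{Con}_T \rightarrow \neg\mathrm{Pr}_T(\ulcorner\neg\mathrm{Con}_T \rightarrow \Lambda\urcorner),$$

which gives $S \vdash \mathrm{Con}_T \rightarrow \mathrm{Con}_{T + \neg\mathrm{Con}_T}$. □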


Corollary 2.2.4 is the Formalized Second Incompleteness Theorem. 
Let us finish this exposition of the proofs with two remarks:

2.2.5. REMARK. By the proof of the Second Theorem, the self-referential
sentence which asserts its own unprovability is equivalent to the sentence 
asserting consistency. Hence, this sentence is unique up to provable 
equivalence and one may correctly speak of the sentence that asserts its 
own unprovability. 


2.2.6. REMARK. If the reader compares the loose statement of the First 
Incompleteness Theorem given earlier with that of Theorem 2.2.2(ii), he 
will notice that we dropped the reference to $\omega$-consistency. We will discuss
this concept in Section 4.2. 


2.3. Things to come 


Except for the discussion of the mechanics of the encoding, we have 
finished proving the Incompleteness Theorems. This seems to be a good 
place to insert a brief description of the sequel. 

In Section 3, we discuss the encoding and some related topics. Section 4 
concerns metamathematical properties other than consistency and presents 
some generalizations of the Incompleteness Theorems. In Section 5, we 
present two applications of the notions and results of Sections 2 and 4. The 
Incompleteness Theorems are obtained by formalizing Syntax — in Section 
6, we discuss what happens when one formalizes Semantics. 


3. Encoding 


The details of an encoding are fascinating to work out and boring to 
read. The author wrote the present section for his own benefit and his 
feelings will not be hurt if the reader chooses to skip it. 

In expositions, one often replaces precise statements by imprecise ones, 
or by precise but false ones. As an example of the latter, it is commonly 
asserted that one proves the Second Incompleteness Theorem by formaliz- 
ing the proof of the First. A more correct statement would be that one 
formalizes D1 by D2 and then reduces the Second Theorem to the First. In 
Section 2.1, we have been guilty of cheating in two places: 

(i) in our vague formulation of the sense in which T contains S, and 

(ii) in our remark on sub. 

We discuss (i) now and (ii) in 3.2.2. 

In Section 2.1 we blithely remarked that T contains S in the sense that
there is the same sort of embedding of S into T as there is of arithmetic into
set theory. In ordinary mathematical practice, the details of an embedding
can be important: Is it continuous? A homomorphism? The same holds
here. We must know, e.g., what we mean by $\mathrm{Pr}_T(\ulcorner\mathrm{Pr}_T(\ulcorner\varphi\urcorner)\urcorner)$, where $\varphi$ is in
the language of T and $\mathrm{Pr}_T$ in that of S.

We propose to sweep all of the difficulties under the rug by strengthening 
the assumption to: 

(i) the language of S is contained in that of T; 

(ii) the axioms of S are among those of T. 
While (i) and (ii) will make life easy for us in general, they will really grease 
the wheels when we discuss D2. 

The usual cases considered do not satisfy these conditions; but, if one 
defines T’ to be the conservative extension of T by the addition of the 
symbols and axioms of S, one can usually show 


(*) $S \vdash \forall x\,[\mathrm{Pr}_T(x) \leftrightarrow \mathrm{Pr}_{T'}(x)]$,


thus reducing the case in question to the one treated here. Theories in 
which (*) does not hold are pathological and we are not interested in them. 

Now that this is settled, we make some additional inessential assump- 
tions. Their only use is to reduce the number of cases that need to be 
considered when we define various functions representing syntactic opera- 
tions (cf. 3.2). They are: 

(i) The only logical connectives and quantifiers are $\neg$, $\rightarrow$, $\forall$.

(ii) S and T contain as constants only the numerals: $\bar 0, \bar 1, \ldots$

(iii) Only numerical variables occur.

(iv) T contains infinitely many n-ary function and relation symbols for
each n.

Thus, the language of T consists of:

numerals: $\bar 0, \bar 1, \ldots$,

numerical variables: $v_0, v_1, \ldots$,

n-ary function symbols: $f^n_0, f^n_1, \ldots$,

n-ary relation symbols: $R^n_0, R^n_1, \ldots$,

connectives: $\neg$, $\rightarrow$,

quantifier: $\forall$.

The other connectives and quantifier are considered to be abbreviations.

We assume that S has a pairing function ( , ) with inverses $\pi_1$, $\pi_2$. Using
them, we assign codes, which are closed terms, to the basic syntactic
objects as follows:

$\bar\imath \mapsto (0, i)$            $\neg \mapsto (4, 4)$
$v_i \mapsto (1, i)$            $\rightarrow\ \mapsto (5, 5)$
$f^n_i \mapsto (2, (n, i))$       $\forall \mapsto (6, 6)$
$R^n_i \mapsto (3, (n, i))$


Terms and formulae are finite sequences of these symbols and deriva- 
tions are finite sequences of formulae. Thus, S will have to be able to 
encode and manipulate finite sequences. In the following subsection, we 
introduce a nice class of functions and discuss their use for such encoding. 
In 3.2, we assume these functions are ‘“‘in” S and finish encoding syntax. 
3.3, 3.4, and 3.5 discuss some generalizations of the First Incompleteness 
Theorem that one can prove once one has an awareness of the encoding 
opportunities available. 


3.1. Primitive recursive encoding of finite sequences 


Loosely put, the primitive recursive functions are those functions of 
natural numbers that are obtained by recursion. Of course, to be obtained 
by recursion, they must be obtained from something and, to avoid minor 
unpleasantries, they must also be closed under explicit definition: 


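3.1.1. DEFINITION. A function f on natural numbers is primitive recursive if
it can be generated after finitely many steps by means of the following
rules, in which $\mathbf{x}$ abbreviates a tuple $x_1, \ldots, x_n$:

i    $f(x) = 0$,                                                        Zero
ii   $f(x) = x + 1$,                                                    Successor
iii  $f(x_1, \ldots, x_n) = x_i$,                                       Projection
iv   $f(\mathbf{x}) = g(h_1(\mathbf{x}), \ldots, h_m(\mathbf{x}))$,     Composition
v    $f(0, \mathbf{x}) = g(\mathbf{x})$,
     $f(x + 1, \mathbf{x}) = h(f(x, \mathbf{x}), x, \mathbf{x})$.       Recursion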
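
The five rules can be read as combinators on Python functions. This is a minimal sketch, assuming functions on natural numbers are modelled as Python callables; the names zero, succ, proj, compose and recursion are chosen for this illustration only. The recursion variable comes first, as in rule v.

```python
from typing import Callable

Fn = Callable[..., int]

def zero(*xs: int) -> int:
    """Rule i: the constant function 0."""
    return 0

def succ(x: int) -> int:
    """Rule ii: successor."""
    return x + 1

def proj(i: int) -> Fn:
    """Rule iii: projection onto the i-th argument (1-based)."""
    return lambda *xs: xs[i - 1]

def compose(g: Fn, *hs: Fn) -> Fn:
    """Rule iv: f(x) = g(h_1(x), ..., h_m(x))."""
    return lambda *xs: g(*(h(*xs) for h in hs))

def recursion(g: Fn, h: Fn) -> Fn:
    """Rule v: f(0, x) = g(x); f(n + 1, x) = h(f(n, x), n, x)."""
    def f(n: int, *xs: int) -> int:
        acc = g(*xs)
        for k in range(n):
            acc = h(acc, k, *xs)
        return acc
    return f

one = compose(succ, zero)                                # constant 1
add = recursion(proj(1), compose(succ, proj(1)))         # add(n, y) = y + n
mul = recursion(zero, compose(add, proj(3), proj(1)))    # mul(n, y) = n * y
power = recursion(one, compose(mul, proj(3), proj(1)))   # power(n, y) = y ** n

assert add(2, 3) == 5 and mul(3, 4) == 12 and power(3, 2) == 8
```

Addition, multiplication and exponentiation arise by iterating recursion, as the text goes on to remark.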


3.1.2. DEFINITION. A relation $R \subseteq N^n$ is primitive recursive if its representing
function,

$$\chi_R(\mathbf{x}) = \begin{cases} 0 & \text{if } R(\mathbf{x}), \\ 1 & \text{if } \neg R(\mathbf{x}), \end{cases}$$

is primitive recursive.


To facilitate the discussion of the encoding of finite sequences, we
establish a few simple closure properties of the classes of primitive
recursive functions and relations. To do this, we need a few functions. 
Trivially, starting with the Zero function and iterating composition of the 
Successor function, we see that all constant functions are primitive 
recursive. Elementary school mathematics tells us that iterating Recursion 
shows that addition, multiplication, and exponentiation are primitive 
recursive. Subtraction takes us out of the domain of natural numbers; 
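however, cut-off subtraction,

$$x \dot{-} y = \begin{cases} x - y & \text{if } x \geq y, \\ 0 & \text{if } x < y, \end{cases}$$

is primitive recursive. For we can define it by Recursion by

$$x \dot{-} 0 = x, \qquad x \dot{-} (y + 1) = \mathrm{pd}(x \dot{-} y),$$

where pd is defined by Recursion by

$$\mathrm{pd}(0) = 0, \qquad \mathrm{pd}(x + 1) = x.$$

Two more handy functions are the sign function, sg, and its complement,
$\overline{\mathrm{sg}}$:

$$\mathrm{sg}(0) = 0, \qquad \mathrm{sg}(x + 1) = 1;$$

$$\overline{\mathrm{sg}}(0) = 1, \qquad \overline{\mathrm{sg}}(x + 1) = 0.$$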


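3.1.3. LEMMA (definition by cases). Let $g_1$, $g_2$, h be primitive recursive and
define f by

$$f(\mathbf{x}) = \begin{cases} g_1(\mathbf{x}) & \text{if } h(\mathbf{x}) = 0, \\ g_2(\mathbf{x}) & \text{if } h(\mathbf{x}) \neq 0. \end{cases}$$

Then f is primitive recursive.

PROOF. $f(\mathbf{x}) = g_1(\mathbf{x}) \cdot \overline{\mathrm{sg}}(h(\mathbf{x})) + g_2(\mathbf{x}) \cdot \mathrm{sg}(h(\mathbf{x}))$. □

3.1.4. COROLLARY. The relation of equality is primitive recursive.

PROOF. Let $h(x, y) = |x - y| = (x \dot{-} y) + (y \dot{-} x)$. Then $\chi_{=}(x, y) = \mathrm{sg}(h(x, y))$. □

Note that sg and $\overline{\mathrm{sg}}$ were used in somewhat of a logical manner in the
above proof. To further illustrate this, let $\chi_R$, $\chi_S$ be the representing
functions of relations R, S. Observe:

$$\chi_{\neg R}(\mathbf{x}) = \overline{\mathrm{sg}}(\chi_R(\mathbf{x})),$$
$$\chi_{R \wedge S}(\mathbf{x}) = \mathrm{sg}(\chi_R(\mathbf{x}) + \chi_S(\mathbf{x})),$$
$$\chi_{R \vee S}(\mathbf{x}) = \chi_R(\mathbf{x}) \cdot \chi_S(\mathbf{x}).$$

If we define bounded quantifiers, $\exists y \leq x$, $\forall y \leq x$, then for $R(y, \mathbf{x})$:

$$\chi_{\exists y \leq x\, R}(x, \mathbf{x}) = \prod_{y \leq x} \chi_R(y, \mathbf{x}),$$

$$\chi_{\forall y \leq x\, R} = \chi_{\neg \exists y \leq x\, \neg R}.$$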

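3.1.5. LEMMA. (i) Let $g(x, \mathbf{x})$ be primitive recursive. Then $f_1$ and $f_2$ are
primitive recursive, where

$$f_1(x, \mathbf{x}) = \sum_{y \leq x} g(y, \mathbf{x}), \qquad f_2(x, \mathbf{x}) = \prod_{y \leq x} g(y, \mathbf{x}).$$

(ii) If $R(y, \mathbf{x})$ is a primitive recursive relation, then the relations S, T are
also primitive recursive, where

$$S(x, \mathbf{x}) \leftrightarrow \exists y \leq x\, R(y, \mathbf{x}), \qquad T(x, \mathbf{x}) \leftrightarrow \forall y \leq x\, R(y, \mathbf{x}).$$

The proof of this lemma is left to the reader.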


3.1.6. DEFINITION (bounded $\mu$-operator). Let $g(y, \mathbf{x})$ be a function. We
define

$$f(x, \mathbf{x}) = \mu y < x\, [g(y, \mathbf{x}) = 0]$$

by $f(x, \mathbf{x}) =$ the least $y < x$ such that $g(y, \mathbf{x}) = 0$, if such a y exists, and
$f(x, \mathbf{x}) = x$, otherwise.


3.1.7. LEMMA. If $g(y, \mathbf{x})$ is primitive recursive, then so is
$\mu y < x\, [g(y, \mathbf{x}) = 0]$.


PROOF. Define

$$f_1(x, \mathbf{x}) = \begin{cases} 0 & \text{if } \exists y \leq x\, [g(y, \mathbf{x}) = 0], \\ 1 & \text{if } \neg\exists y \leq x\, [g(y, \mathbf{x}) = 0]. \end{cases}$$

$f_1$ is primitive recursive and we may define

$$f(x, \mathbf{x}) = \sum_{y < x} f_1(y, \mathbf{x}).$$ □

We have enough tools at hand to show the primitive recursiveness of the
well-known pairing function,

$$(x, y) = \tfrac{1}{2}((x + y)^2 + 3x + y),$$

and its inverses. We simply use the bounded $\mu$-operator:

$$(x, y) = \mu z < (x + y)^2 + 3x + y + 1\, [2z = (x + y)^2 + 3x + y],$$
$$\pi_1 z = \mu x < z + 1\, [\exists y \leq z\, ((x, y) = z)],$$
$$\pi_2 z = \mu y < z + 1\, [(\pi_1 z, y) = z],$$

where we use the fact that $x, y \leq (x, y)$.
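
As a check on the definitions, here is a small sketch of the bounded $\mu$-operator, both as a direct search and as the bounded sum used in the proof of Lemma 3.1.7, together with the pairing function and its inverses obtained by bounded search. The function names (mu_lt, mu_lt_by_sum, pair, pi1, pi2) are assumptions of this illustration.

```python
def mu_lt(g):
    """Definition 3.1.6 read directly: least y < x with g(y, ...) = 0, else x."""
    def f(x, *xs):
        for y in range(x):
            if g(y, *xs) == 0:
                return y
        return x
    return f

def mu_lt_by_sum(g):
    """Proof of Lemma 3.1.7: f(x, ...) is the sum of f1(y, ...) over y < x."""
    def f1(y, *xs):
        return 0 if any(g(z, *xs) == 0 for z in range(y + 1)) else 1
    return lambda x, *xs: sum(f1(y, *xs) for y in range(x))

def pair(x, y):
    """(x, y) = ((x + y)^2 + 3x + y) / 2; the numerator is always even."""
    return ((x + y) ** 2 + 3 * x + y) // 2

def pi1(z):
    """First inverse: least x < z + 1 such that some y <= z has pair(x, y) = z."""
    g = lambda x, z: 0 if any(pair(x, y) == z for y in range(z + 1)) else 1
    return mu_lt(g)(z + 1, z)

def pi2(z):
    """Second inverse: least y < z + 1 with pair(pi1(z), y) = z."""
    x = pi1(z)
    g = lambda y, z: 0 if pair(x, y) == z else 1
    return mu_lt(g)(z + 1, z)

z = pair(3, 4)
assert (pi1(z), pi2(z)) == (3, 4)
```

The searches are bounded by z + 1 precisely because both components of a pair are at most the pair itself.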

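To encode finite sequences, we use the Fundamental Theorem of
Arithmetic, whereby every natural number $\geq 2$ has a unique representation:

$$a = p_{i_0}^{n_0} \cdots p_{i_k}^{n_k},$$

where $p_{i_0}, \ldots, p_{i_k}$ are distinct primes and all $n_i$ are positive. We have the
following definitions:

i    $x \mid y \leftrightarrow \exists z \leq y\, (xz = y)$.

ii   $x < y \leftrightarrow \exists z \leq y\, (y = x + z + 1)$.

iii  x is prime $\leftrightarrow x \neq 0 \wedge x \neq 1 \wedge \forall z \leq x\, [z \mid x \rightarrow z = x \vee z = 1]$.

iv   $p_n$ = n-th prime: $p_0 = 2$,
     $p_{n+1} = \mu x < p_n! + 1\, [p_n < x \wedge x \text{ is prime}]$.

v    $a \in \mathrm{Seq} \leftrightarrow a = 1 \vee (a > 1 \wedge \forall x < a\, [p_{x+1} \mid a \rightarrow p_x \mid a])$.

vi   $\mathrm{lh}(a) = \begin{cases} 0 & \text{if } a \notin \mathrm{Seq} \vee a = 1, \\ \mu x < a\, [p_x \mid a \wedge p_{x+1} \nmid a] & \text{if } a \in \mathrm{Seq} \wedge a \neq 1. \end{cases}$

vii  $(a)_x = \mu y < a\, [p_x^{y+1} \mid a \wedge p_x^{y+2} \nmid a]$.

viii $a * b = \begin{cases} a \cdot \prod_{x \leq \mathrm{lh}(b)} p_{\mathrm{lh}(a)+x+1}^{(b)_x + 1} & \text{if } a \neq 1 \wedge b \neq 1, \\ a & \text{if } b = 1 \wedge a \neq 1, \\ b & \text{if } a = 1. \end{cases}$

We should comment on v-viii. Seq denotes the set of sequence numbers,
i.e. those numbers with no gaps in their list of prime divisors. For such
numbers, we have

$$a = \prod_{i \leq \mathrm{lh}(a)} p_i^{(a)_i + 1}.$$

If a, b are sequence numbers encoding $(a_0, \ldots, a_m)$, $(b_0, \ldots, b_n)$, respectively,
then $a * b$ is a sequence number encoding the concatenation

$$(a_0, \ldots, a_m, b_0, \ldots, b_n).$$

We write $(a_0, \ldots, a_n)$ for $2^{a_0 + 1} \cdots p_n^{a_n + 1}$. In particular, $(a) = 2^{a + 1}$ and
$( ) = 1$.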
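
The sequence-number apparatus is easy to experiment with. The sketch below computes the same functions by ordinary loops rather than by bounded $\mu$-search; the names nth_prime, encode, is_seq, lh, component and concat are assumptions of this illustration, and component is only meant to be applied to positions $x \leq \mathrm{lh}(a)$ of a sequence number $a \neq 1$.

```python
def nth_prime(n):
    """p_0 = 2, p_1 = 3, ... by trial division."""
    count, cand = -1, 1
    while count < n:
        cand += 1
        if all(cand % d for d in range(2, int(cand ** 0.5) + 1)):
            count += 1
    return cand

def encode(seq):
    """(a_0, ..., a_n) = 2^(a_0 + 1) * ... * p_n^(a_n + 1); the empty sequence is 1."""
    a = 1
    for i, v in enumerate(seq):
        a *= nth_prime(i) ** (v + 1)
    return a

def is_seq(a):
    """a is a sequence number: no gaps in its list of prime divisors."""
    if a < 1:
        return False
    i = 0
    while a % nth_prime(i) == 0:
        while a % nth_prime(i) == 0:
            a //= nth_prime(i)
        i += 1
    return a == 1

def lh(a):
    """Index of the last entry, as in vi (0 for non-sequences and for 1)."""
    if a == 1 or not is_seq(a):
        return 0
    i = 0
    while a % nth_prime(i + 1) == 0:
        i += 1
    return i

def component(a, x):
    """(a)_x: the exponent of p_x in a, minus one (assumes p_x divides a)."""
    p, e = nth_prime(x), 0
    while a % p == 0:
        a //= p
        e += 1
    return e - 1

def concat(a, b):
    """a * b as in viii."""
    if a == 1:
        return b
    if b == 1:
        return a
    out = a
    for x in range(lh(b) + 1):
        out *= nth_prime(lh(a) + x + 1) ** (component(b, x) + 1)
    return out

a = encode([3, 0, 2])            # 2^4 * 3^1 * 5^3
assert a == 6000 and lh(a) == 2
assert [component(a, x) for x in range(3)] == [3, 0, 2]
```

Note that lh(a) is the index of the last entry, so a sequence with n + 1 entries has lh equal to n; this is the convention the later definitions of terms, formulae and derivations rely on.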


An immediate application of these notions is the following. 


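3.1.8. LEMMA (course-of-values recursion). Let g, h be primitive recursive
and define f by

$$f(0, \mathbf{x}) = g(\mathbf{x}), \qquad f(x + 1, \mathbf{x}) = h(\bar f(x, \mathbf{x}), x, \mathbf{x}),$$

where $\bar f(x, \mathbf{x}) = (f(0, \mathbf{x}), \ldots, f(x, \mathbf{x}))$ is the course-of-values function
associated with f. Then f is primitive recursive.

PROOF. Observe that $\bar f$ is primitive recursive:

$$\bar f(0, \mathbf{x}) = 2^{g(\mathbf{x}) + 1}, \qquad \bar f(x + 1, \mathbf{x}) = \bar f(x, \mathbf{x}) * (h(\bar f(x, \mathbf{x}), x, \mathbf{x})).$$

But $f(x, \mathbf{x}) = (\bar f(x, \mathbf{x}))_x$. □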
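
A familiar instance of course-of-values recursion is the Fibonacci sequence, where the step function needs the two previous values. The sketch below carries the history as a prime-power sequence number, as in the proof; the helper names (primes, component, course_of_values, fib_step) are assumptions of this illustration, and the codes grow very quickly, so it is meant for small arguments only.

```python
def primes(k):
    """The first k primes p_0, ..., p_(k-1)."""
    ps, c = [], 2
    while len(ps) < k:
        if all(c % p for p in ps):
            ps.append(c)
        c += 1
    return ps

def component(code, p):
    """Entry stored at the prime p: its exponent in code, minus one."""
    e = 0
    while code % p == 0:
        code //= p
        e += 1
    return e - 1

def course_of_values(g, h):
    """Return (f, fbar) with f(0) = g() and f(n + 1) = h(fbar(n), n)."""
    def fbar(x, *xs):
        ps = primes(x + 1)
        code = ps[0] ** (g(*xs) + 1)            # fbar(0) = 2^(g + 1)
        for n in range(x):
            nxt = h(code, n, *xs)
            code *= ps[n + 1] ** (nxt + 1)      # append the new value
        return code
    def f(x, *xs):
        return component(fbar(x, *xs), primes(x + 1)[x])
    return f, fbar

def fib_step(code, n):
    """h for Fibonacci: f(1) = 1, f(n + 1) = f(n) + f(n - 1) for n >= 1."""
    if n == 0:
        return 1
    ps = primes(n + 1)
    return component(code, ps[n]) + component(code, ps[n - 1])

fib, fib_bar = course_of_values(lambda: 0, fib_step)
assert [fib(n) for n in range(9)] == [0, 1, 1, 2, 3, 5, 8, 13, 21]
assert fib_bar(2) == 2 ** 1 * 3 ** 2 * 5 ** 2   # the sequence (0, 1, 1)
```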


We will use this lemma in the next subsection to finish our discussion of 
encoding. 


3.2. Primitive recursive encoding of syntax 


So far we have codes for basic syntactic objects (variables, numerals, 
etc.) and a primitive recursive technique of encoding finite sequences. We 
now combine what we have to encode more complicated syntactic objects. 


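3.2.1. DEFINITION. We generate codes for complex terms and formulae as
follows:

(i) If $t_1, \ldots, t_n$ have codes $\ulcorner t_1\urcorner, \ldots, \ulcorner t_n\urcorner$, then

$$\ulcorner f^n_i t_1 \cdots t_n\urcorner = (\ulcorner f^n_i\urcorner, \ulcorner t_1\urcorner, \ldots, \ulcorner t_n\urcorner),$$
$$\ulcorner R^n_i t_1 \cdots t_n\urcorner = (\ulcorner R^n_i\urcorner, \ulcorner t_1\urcorner, \ldots, \ulcorner t_n\urcorner),$$

where $\ulcorner f^n_i\urcorner$, $\ulcorner R^n_i\urcorner$ are the codes assigned in the introduction.

(ii) If $\varphi$, $\psi$ have codes $\ulcorner\varphi\urcorner$, $\ulcorner\psi\urcorner$, respectively, then

$$\ulcorner\neg\varphi\urcorner = (\ulcorner\neg\urcorner, \ulcorner\varphi\urcorner), \qquad \ulcorner\varphi \rightarrow \psi\urcorner = (\ulcorner\rightarrow\urcorner, \ulcorner\varphi\urcorner, \ulcorner\psi\urcorner).$$

(iii) If $\varphi$ has code $\ulcorner\varphi\urcorner$ and $v_i$ is a variable, then

$$\ulcorner\forall v_i \varphi\urcorner = (\ulcorner\forall\urcorner, \ulcorner v_i\urcorner, \ulcorner\varphi\urcorner).$$

[Note. This gives us the functions neg, imp: $\mathrm{neg}(x) = (\ulcorner\neg\urcorner, x)$; $\mathrm{imp}(x, y) = (\ulcorner\rightarrow\urcorner, x, y)$.]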


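We now show that the complex syntactic notions are primitive recursive:

(i) The representing function for terms is defined by course-of-values
recursion:

$$T(x) = \begin{cases} 0 & \text{if } \exists i \leq x\, [x = (0, i) \vee x = (1, i)], \\ 0 & \text{if } x \in \mathrm{Seq} \wedge \exists n, i \leq x\, [(x)_0 = (2, (n, i)) \wedge \mathrm{lh}(x) = n \wedge \forall y < n\, (T((x)_{y+1}) = 0)], \\ 1 & \text{otherwise.} \end{cases}$$

(ii) Similarly, one defines the representing function for formulae:

$$F(x) = \begin{cases} 0 & \text{if } x \in \mathrm{Seq} \wedge \exists n, i \leq x\, [(x)_0 = (3, (n, i)) \wedge \mathrm{lh}(x) = n \wedge \forall y < n\, (T((x)_{y+1}) = 0)], \\ 0 & \text{if } x \in \mathrm{Seq} \wedge (x)_0 = \ulcorner\neg\urcorner \wedge F((x)_1) = 0 \wedge \mathrm{lh}(x) = 1, \\ 0 & \text{if } x \in \mathrm{Seq} \wedge (x)_0 = \ulcorner\rightarrow\urcorner \wedge F((x)_1) = F((x)_2) = 0 \wedge \mathrm{lh}(x) = 2, \\ 0 & \text{if } x \in \mathrm{Seq} \wedge (x)_0 = \ulcorner\forall\urcorner \wedge \exists i \leq x\, ((x)_1 = (1, i)) \wedge F((x)_2) = 0 \wedge \mathrm{lh}(x) = 2, \\ 1 & \text{otherwise.} \end{cases}$$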
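
Actual Gödel numbers of formulae are far too large to handle, but the shape of T and F can be seen on a structural stand-in. In the sketch below, which is an assumption of this illustration and not the numerical coding of the text, pairs and finite sequences are both modelled by Python tuples: basic symbols receive the pairs from the table of codes, and a compound expression is the tuple of its head symbol followed by the codes of its parts. The recognizers return Python booleans rather than 0 and 1.

```python
NEG, IMP, ALL = (4, 4), (5, 5), (6, 6)   # codes of the connectives and quantifier

def numeral(i): return (0, i)
def var(i):     return (1, i)
def fn(n, i):   return (2, (n, i))       # i-th n-ary function symbol
def rel(n, i):  return (3, (n, i))       # i-th n-ary relation symbol

def _is_head(h, tag):
    """h has the shape (tag, (n, i))."""
    return (isinstance(h, tuple) and len(h) == 2 and h[0] == tag
            and isinstance(h[1], tuple) and len(h[1]) == 2)

def is_term(x):
    """Counterpart of T(x) = 0."""
    if not isinstance(x, tuple) or not x:
        return False
    if x[0] in (0, 1) and len(x) == 2 and isinstance(x[1], int):
        return True                                  # numeral or variable
    if _is_head(x[0], 2):
        n = x[0][1][0]
        return len(x) == n + 1 and all(is_term(t) for t in x[1:])
    return False

def is_formula(x):
    """Counterpart of F(x) = 0."""
    if not isinstance(x, tuple) or not x:
        return False
    head = x[0]
    if _is_head(head, 3):
        n = head[1][0]
        return len(x) == n + 1 and all(is_term(t) for t in x[1:])
    if head == NEG:
        return len(x) == 2 and is_formula(x[1])
    if head == IMP:
        return len(x) == 3 and is_formula(x[1]) and is_formula(x[2])
    if head == ALL:
        return (len(x) == 3 and isinstance(x[1], tuple) and len(x[1]) == 2
                and x[1][0] == 1 and isinstance(x[1][1], int) and is_formula(x[2]))
    return False

atom = (rel(1, 0), var(0))               # R(v_0) for a unary relation symbol R
assert is_formula((ALL, var(0), (IMP, atom, (NEG, atom))))
assert not is_formula((ALL, numeral(0), atom))   # cannot quantify a numeral
```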
3.2.2. Sub 


In Section 2.1 we spoke of a substitution function. As it is, we need two: 
one to replace a (code for a) free variable by a (code for a) term and one to 
replace a (code for a) free variable by a (code for a) numeral which 
supposedly designates the same number designated by a given term. The 
former is needed, e.g., to recognize axioms such as $\forall x\, \varphi x \rightarrow \varphi t$. Both could
be used for the Diagonalization Lemma; but the latter is needed if one 
wants a free-variable form of the Diagonalization Lemma. Other uses are 
hard to describe here and we leave the function to “‘speak for itself” when 
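we apply it later. We define the first syntactic substitution function by
course-of-values recursion, first treating terms and then formulae:

$$\mathrm{sub}(\ulcorner v_i\urcorner, i, y) = y,$$
$$\mathrm{sub}(\ulcorner v_j\urcorner, i, y) = \ulcorner v_j\urcorner, \quad j \neq i,$$
$$\mathrm{sub}(\ulcorner f^n_j t_1 \cdots t_n\urcorner, i, y) = (\ulcorner f^n_j\urcorner, \mathrm{sub}(\ulcorner t_1\urcorner, i, y), \ldots, \mathrm{sub}(\ulcorner t_n\urcorner, i, y)),$$
$$\mathrm{sub}(\ulcorner R^n_j t_1 \cdots t_n\urcorner, i, y) = (\ulcorner R^n_j\urcorner, \mathrm{sub}(\ulcorner t_1\urcorner, i, y), \ldots, \mathrm{sub}(\ulcorner t_n\urcorner, i, y)),$$
$$\mathrm{sub}(\ulcorner\neg\varphi\urcorner, i, y) = (\ulcorner\neg\urcorner, \mathrm{sub}(\ulcorner\varphi\urcorner, i, y)),$$
$$\mathrm{sub}(\ulcorner\varphi \rightarrow \psi\urcorner, i, y) = (\ulcorner\rightarrow\urcorner, \mathrm{sub}(\ulcorner\varphi\urcorner, i, y), \mathrm{sub}(\ulcorner\psi\urcorner, i, y)),$$
$$\mathrm{sub}(\ulcorner\forall v_i \varphi\urcorner, i, y) = \ulcorner\forall v_i \varphi\urcorner,$$
$$\mathrm{sub}(\ulcorner\forall v_j \varphi\urcorner, i, y) = (\ulcorner\forall\urcorner, \ulcorner v_j\urcorner, \mathrm{sub}(\ulcorner\varphi\urcorner, i, y)), \quad j \neq i,$$
$$\mathrm{sub}(x, i, y) = 0, \quad x \text{ not of the above forms.}$$

With this definition, one easily sees that, for any term t,

$$\mathrm{sub}(\ulcorner\varphi v_i\urcorner, i, \ulcorner t\urcorner) = \ulcorner\varphi t\urcorner.$$

If we observe that $v_i$ occurs freely in $\varphi$ iff $\mathrm{sub}(\ulcorner\varphi\urcorner, i, (0, i)) \neq \ulcorner\varphi\urcorner$, then we
can primitive recursively define the function sub referred to in Section 2.1
by

$$\mathrm{sub}(x, y) = \begin{cases} \mathrm{sub}(x, i, y) & \text{if } i = \mu j < x\, [v_j \text{ occurs free in } x], \\ x & \text{if } i \text{ does not exist.} \end{cases}$$

$\mathrm{sub}_2$, $\mathrm{sub}_3$, etc. are defined by iteration.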
A second important substitution function is 


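\[ s(x, y) = \mathrm{sub}(x, (0, y)). \]

This satisfies: If $\varphi v_i$ has only $v_i$ free and $t$ is a closed term denoting $n$, then
$s(\ulcorner \varphi v_i \urcorner, t) = \ulcorner \psi \urcorner$, where $\psi$ is equivalent to $\varphi t$. (To prove $\psi \leftrightarrow \varphi t$, one need
only show $\vdash t = \bar{n}$. For then, substitutivity of equality yields $\vdash \varphi \bar{n} \leftrightarrow \varphi t$.)
Moreover, if $\varphi$ has only $v_i$ free, $s(\ulcorner \varphi \urcorner, x)$ is formally the code of a sentence.
We often abbreviate $s(\ulcorner \varphi x \urcorner, y)$ by $\ulcorner \varphi \dot{y} \urcorner$.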
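
The recursion for sub can be made concrete. The following Python sketch uses nested tuples in place of sequence numbers. The pairs (1, i) for the variable v_i and (0, n) for the numeral of n follow the codes used above; the string tags for the symbols and the names occurs_free, indices and sub_first are assumptions made only for this illustration. Malformed input (the clause returning 0) is not handled, and, exactly as in the definition above, no attempt is made to avoid capture of variables.

```python
# Codes are modelled as nested tuples instead of sequence numbers (assumption).
# (1, i) stands for the variable v_i and (0, n) for the numeral of n.
# Compound expressions carry a string tag first: ("all", var, body),
# ("not", a), ("imp", a, b), or a function/relation symbol such as ("+", s, t).


def is_atom(x):
    """True for variable and numeral codes."""
    return len(x) == 2 and x[0] in (0, 1)


def sub(x, i, y):
    """Replace the free occurrences of v_i in x by the code y."""
    if x == (1, i):
        return y
    if is_atom(x):
        return x  # another variable, or a numeral
    if x[0] == "all":
        _, var, body = x
        if var == (1, i):
            return x  # v_i is bound here, nothing to replace
        return ("all", var, sub(body, i, y))
    # terms, atomic formulae, negations, implications: recurse on the parts
    return (x[0],) + tuple(sub(part, i, y) for part in x[1:])


def occurs_free(x, i):
    """v_i occurs free in x iff substituting the numeral (0, i) changes x."""
    return sub(x, i, (0, i)) != x


def indices(x):
    """All variable indices appearing in x, bound or free."""
    if is_atom(x):
        return {x[1]} if x[0] == 1 else set()
    found = set()
    for part in x[1:]:
        found |= indices(part)
    return found


def sub_first(x, y):
    """Substitute y for the free variable of least index, if there is one."""
    free = [i for i in sorted(indices(x)) if occurs_free(x, i)]
    return sub(x, free[0], y) if free else x


def s(x, n):
    """Substitute the numeral of n for the first free variable of x."""
    return sub_first(x, (0, n))
```

For instance, with phi = ("all", (1, 0), ("=", ("+", (1, 0), (1, 1)), (1, 1))), the call s(phi, 7) replaces both occurrences of v_1 by (0, 7) and leaves the bound v_0 alone; a second call s(s(phi, 7), 3) changes nothing, since the result is already the code of a sentence.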


3.2.3. $\mathrm{Prov}_T$


The next step is to define $\mathrm{Prov}_T(x, y)$. A derivation of $\varphi$ is a sequence of
formulae such that each element of the sequence is either an axiom of T, a 
logical axiom, or a consequence by some logical rule of earlier members of 
the sequence. The logical axioms can be taken to be primitive recursive in 
the sense that the set of codes of such axioms is primitive recursive. We 
leave it to the reader to check this for his favorite axiomatization. Further,
by clever choice of axioms, we can assume that the only rule of inference is
modus ponens:

\[ \frac{\varphi, \quad \varphi \to \psi}{\psi} \]

We assume that the set of (codes of) axioms of T is primitive recursive.
Thus, we get:

\[
\begin{aligned}
\mathrm{Prov}_T(x, y) \leftrightarrow{} & x \in \mathrm{Seq} \wedge \forall i \le \mathrm{lh}(x)\, [(x)_i \text{ is a logical axiom} \\
& \vee (x)_i \text{ is an axiom of } T \\
& \vee \exists j, k < i\, ((x)_k = \mathrm{imp}((x)_j, (x)_i))] \\
& \wedge y = (x)_{\mathrm{lh}(x)}, \\
\mathrm{Pr}_T(y) \leftrightarrow{} & \exists x\, \mathrm{Prov}_T(x, y).
\end{aligned}
\]
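
The definition of the proof predicate translates directly into a checker. The sketch below is only an illustration under stated assumptions: a derivation is a Python list rather than a sequence number, codes are tuples or strings, and the two axiom tests is_logical_axiom and is_axiom_of_T are supplied by the caller (both names are hypothetical). Note that only the relation "x is a derivation of y" is checked; the predicate obtained by prefixing an unbounded existential quantifier over x is not computed.

```python
def imp(a, b):
    """Code of the implication a -> b (tuples stand in for codes)."""
    return ("imp", a, b)


def prov(x, y, is_logical_axiom, is_axiom_of_T):
    """Check that the sequence x is a derivation whose last line is y."""
    if not x:
        return False
    for i, line in enumerate(x):
        # modus ponens: some earlier line is the implication from
        # another earlier line to the current one
        by_modus_ponens = any(
            x[k] == imp(x[j], line) for j in range(i) for k in range(i)
        )
        if not (is_logical_axiom(line) or is_axiom_of_T(line) or by_modus_ponens):
            return False
    return x[-1] == y
```

With a toy theory whose only axioms are "p" and imp("p", "q"), the list ["p", imp("p", "q"), "q"] is accepted as a derivation of "q", while the one-line list ["q"] is rejected.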

3.2.4. S, I: Numeralwise representability 


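A relation $R \subseteq N^n$ is said to be numeralwise represented or numerated by
a formula $\varphi$ in S if one has, for all $m_1, \ldots, m_n$,

\[ R m_1 \cdots m_n \text{ is true iff } S \vdash \varphi \bar{m}_1 \cdots \bar{m}_n. \]
R is binumerated by $\varphi$ if one also has
\[ R m_1 \cdots m_n \text{ is false iff } S \vdash \neg\varphi \bar{m}_1 \cdots \bar{m}_n. \]

We assume that S binumerates every primitive recursive relation. To do
this, it suffices to find a formula $\varphi_f$, for every primitive recursive function f,
such that all numerical instances of the defining equations for f are
provable. E.g. if f is defined by Recursion from g, h, then for all
$m_1, \ldots, m_n, m$,

\[
\begin{aligned}
S \vdash{} & \exists! x\, \varphi_f(\bar{0}, \bar{m}_1, \ldots, \bar{m}_n, x) \wedge \forall x\, [\varphi_f(\bar{0}, \bar{m}_1, \ldots, \bar{m}_n, x) \to \varphi_g(\bar{m}_1, \ldots, \bar{m}_n, x)], \\
S \vdash{} & \exists! x\, \varphi_f(\overline{m+1}, \bar{m}_1, \ldots, \bar{m}_n, x) \\
& \wedge \forall x, y\, [\varphi_f(\bar{m}, \bar{m}_1, \ldots, \bar{m}_n, y) \wedge \varphi_f(\overline{m+1}, \bar{m}_1, \ldots, \bar{m}_n, x) \to \varphi_h(y, \bar{m}, \bar{m}_1, \ldots, \bar{m}_n, x)].
\end{aligned}
\]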


Then, a metamathematical induction on the number of steps it takes to 
generate f (and on the first argument of f in the case of definition by 
recursion) shows that g, binumerates the graph of f. 

Given that S binumerates all primitive recursive functions (and hence all
primitive recursive relations), it follows that all of the primitive recursive
encoding functions are binumerated: neg, imp, sub, s, and $\mathrm{Prov}_T$. This last
fact allows us to verify D1:

\[
\begin{aligned}
T \vdash \varphi &\Rightarrow S \vdash \mathrm{Prov}_T(t, \ulcorner \varphi \urcorner) \text{ for some closed term } t, \\
&\Rightarrow S \vdash \mathrm{Pr}_T(\ulcorner \varphi \urcorner).
\end{aligned}
\]


A quick rereading of the proof of the First Incompleteness Theorem will 
show that we did not need D2 and D3 to establish it. Thus we have given 
(modulo only a little handwaving) a complete proof of this Theorem. 


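3.2.5. S, II: D2 and D3

The Second Incompleteness Theorem does not come so cheaply. For D3,
one must show

\[ S \vdash \mathrm{Prov}_T(a, \ulcorner \varphi \urcorner) \wedge \mathrm{Prov}_T(b, \ulcorner \varphi \to \psi \urcorner) \to \mathrm{Prov}_T(a * b * (\ulcorner \psi \urcorner), \ulcorner \psi \urcorner), \]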


with free variables a, b. To do this requires much more than mere 
binumerability of primitive recursive relations: The representations must 
be correct with free variables. For this, S must be able to prove not merely 
each instance of the defining equations for a given primitive recursive 
function, but must also prove the equations with free variables. It will also 
be necessary for S to prove induction on primitive recursive relations. 

The latter necessity is clear if one considers D2. D2 is a formal version of
the following sharpening of D1: If f is primitive recursive,

\[
\begin{aligned}
(*) \quad f m_1 \cdots m_n = m &\Rightarrow T \vdash f \bar{m}_1 \cdots \bar{m}_n = \bar{m} \\
&\Rightarrow S \vdash \mathrm{Prov}_T(d(\bar{m}_1, \ldots, \bar{m}_n), \ulcorner f \bar{m}_1 \cdots \bar{m}_n = \bar{m} \urcorner),
\end{aligned}
\]

for some primitive recursive function d. To see that such a d exists, observe
that a proof that $f \bar{m}_1 \cdots \bar{m}_n = \bar{m}$ (in S, and hence in T) is almost just a computation:
it will be given by a sequence of equations and implications of equations.
Thus, the existence of d satisfying (*) is not too surprising; nor is the fact
that we claim:

\[ (**) \quad S \vdash f\vec{x} = y \to \mathrm{Prov}_T(d\vec{x}, \ulcorner f\dot{\vec{x}} = \dot{y} \urcorner). \]

The proof, which is too long to be included here, proceeds by
metamathematical induction on the number of steps needed to generate f
and (when the recursion clause is used) formal induction on the primitive
recursive relation (**). Once one has accepted (**), the primitive recursiveness
of $\mathrm{Prov}_T$ yields

\[ S \vdash \mathrm{Prov}_T(x, y) \to \mathrm{Pr}_T(\ulcorner \mathrm{Prov}_T(\dot{x}, \dot{y}) \urcorner), \]
and one needs only apply D1, D3 to the valid formula $\varphi x \to \exists x\, \varphi x$, to
conclude the validity of D2.


3.2.6. S, III: Choice of S 

By the preceding discussion, one can adequately encode syntax in S if S
admits a representation of primitive recursive functions in such a way that
(i) the defining equations for primitive recursive functions are provable 
with free variables; 

(ii) induction on primitive recursive relations is provable; 

(iii) computations are almost derivations of the equations they establish. 
We list three examples of such theories. 

(a) PRA = Primitive Recursive Arithmetic. PRA contains the numerals 
0,1,... and there is a function symbol in PRA for each (definition via the 
rules for the generation of primitive recursive functions of a) primitive 
recursive function. In addition to some trivial axioms concerning the 
constants and the successor function, the axioms of PRA are the defining 
equations of the functions and induction on quantifier-free formulae. 

(b) PA = Peano’s Arithmetic. PA also has the numerals as its constants, 
but it only has function symbols for successor, addition, and multiplication. 
The axioms consist of trivial axioms concerning the constants and the 
successor function, the recursion equations for addition and multiplication, 
and induction on all formulae of the language. Even proving that PA 
binumerates the primitive recursive functions requires another encoding 
trick. The most famous technique uses the Chinese Remainder Theorem to 
encode finite sequences. Conditions (i) and (ii) are proven by formalizing
the use of the encoding of finite sequences, which is non-trivial insofar as very
few texts give the details. Condition (iii) can be bypassed by observing that
the representation of a PR function can be written in the form $\exists x\, \varphi$, where
$\varphi$ is much simpler syntactically than the corresponding primitive recursive
function. E.g., by Matiyasevich's Theorem, $\varphi$ can be taken to be an
equation involving two polynomials. Thus, the formalization of
$\varphi x \to \mathrm{Pr}_{PA}(\ulcorner \varphi \dot{x} \urcorner)$ is much simpler.

(c) ZF = Zermelo-Fraenkel set theory. This is both a good and a bad 
example. It is bad because the whole encoding problem is more easily 
solved in a set theory than in an arithmetic theory. By the same token, it is 
a good example. 


3.3. Rosser’s Theorem 


By Section 3.2, binumerability of primitive recursive relations in S
suffices for the First Incompleteness Theorem, as a condition on S. There
is still the necessity of assuming something about T:

(i) that T contain S, 

(ii) that T be consistent, and 

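(iii) (for the second half of the theorem) that theorems of T of the form
$\mathrm{Pr}_T(\ulcorner \varphi \urcorner)$ be true.
Rosser's Theorem allows one to drop the last soundness condition on T by
using a modification of $\mathrm{Prov}_T$: Define

\[
\begin{aligned}
\mathrm{Prov}^R_T(x, y) &\leftrightarrow \mathrm{Prov}_T(x, y) \wedge \forall z, w \le x\, [\mathrm{Prov}_T(z, w) \to y \ne \mathrm{neg}(w) \wedge w \ne \mathrm{neg}(y)], \\
\mathrm{Pr}^R_T(y) &\leftrightarrow \exists x\, \mathrm{Prov}^R_T(x, y), \\
\mathrm{Con}^R_T &\leftrightarrow \neg\mathrm{Pr}^R_T(\ulcorner \Lambda \urcorner).
\end{aligned}
\]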


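3.3.1. ROSSER'S THEOREM. Let $T \vdash \varphi \leftrightarrow \neg\mathrm{Pr}^R_T(\ulcorner \varphi \urcorner)$. Then
(i) $T \nvdash \varphi$;
(ii) $T \nvdash \neg\varphi$;
(iii) $T \vdash \mathrm{Con}^R_T$.


PROOF. (i) By the consistency of T, $\mathrm{Prov}_T$ and $\mathrm{Prov}^R_T$ binumerate the same
relation. Hence D1$^R$ holds: $T \vdash \psi \Rightarrow T \vdash \mathrm{Pr}^R_T(\ulcorner \psi \urcorner)$. Thus, the proof of the first
part of the First Incompleteness Theorem yields the result.

(ii) This follows from (iii).

(iii) We leave this to the reader along with the remark that T is
consistent and $T \vdash \neg\Lambda$. □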


*3.4. Recursion theory 


(The reader is referred to Chapter C.1 for a full discussion of recursion 
theory.) 

Historically, recursion theory developed out of the incompleteness 
theorems. Once one knows a little recursion theory, however, it is natural 
to look back. 


3.4.1. DEFINITION. A set $S \subseteq N$ of natural numbers is recursively enumerable
(r.e.) iff for some primitive recursive relation R,
\[ Sx \leftrightarrow \exists y\, Rxy. \]


An equivalent definition is:

3.4.2. DEFINITION. A set $S \subseteq N$ is r.e. iff $S = \emptyset$ or, for some primitive
recursive function f, $S = \mathrm{ran}(f)$.
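
The passage from the first definition to the second can be illustrated by a small enumeration procedure. The Python sketch below is an illustration only: R is assumed to be given as an ordinary Python predicate, and the names enumerate_re and is_square_witness are hypothetical. It runs through all pairs (x, y) in order of increasing x + y and outputs x whenever R holds, so every member of the projection is eventually listed (possibly more than once).

```python
from itertools import islice


def enumerate_re(R):
    """Yield each x such that R(x, y) holds for some y."""
    n = 0
    while True:
        # run through the finitely many pairs (x, y) with x + y == n
        for x in range(n + 1):
            if R(x, n - x):
                yield x
        n += 1


def is_square_witness(x, y):
    """A decidable relation whose projection is the set of perfect squares."""
    return y * y == x


first_squares = list(islice(enumerate_re(is_square_witness), 5))
```

Here first_squares lists the first five outputs of the enumeration. If the set is empty the generator simply never yields, which is why the second definition treats the empty set separately.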


Another useful concept is given by the following definition. 


3.4.3. DEFINITION. A set $S \subseteq N$ is recursive iff S and $N - S$ are both r.e. A
function $f : N \to N$ is recursive iff its graph (viewed as a subset of N by
means of a primitive recursive pairing function) is recursive.

The recursion-theoretic counterpart of the First Incompleteness 
Theorem is: 


3.4.4. THEOREM. There is an r.e. non-recursive set. 


One proves this by finding an enumeration, $W_0, W_1, \ldots$, of r.e. sets and
an r.e. set $K_1$ such that

\[ \forall x y\, ((x, y) \in K_1 \leftrightarrow x \in W_y). \]

Then $K = \{x : (x, x) \in K_1\}$ is r.e. with a non-r.e. complement. One then
proves the First Incompleteness Theorem by showing that K can be
numeralwise represented in T. If T is sound enough, this is not too difficult:
one uses the numeralwise representation of K in S that arises from the
binumeration of the primitive recursive relation that K is the projection of.
If T is not very sound, the numeralwise representation of K in S may not be
one in T as T may simply prove more numbers to be in K. One usually
avoids difficulties with numeralwise representations in unsound theories by
means of the following:


3.4.5. DEFINITION. Let $A, B \subseteq N$ be disjoint r.e. sets. A and B are effectively
inseparable iff there is a recursive function f such that for all r.e. sets
$W_i$, $W_j$, if

(i) $W_i \cap W_j = \emptyset$,

(ii) $A \subseteq W_i$, $B \subseteq W_j$,
then $f(i, j) \notin W_i \cup W_j$.


3.4.6. THEOREM. Effectively inseparable r.e. sets exist. 


We shall accept this on faith. 

To prove that part of Rosser's Theorem that corresponds to the First
Incompleteness Theorem, one constructs $\varphi$, $\psi$ which numeralwise represent
A, B, respectively, in S and for which

\[ (*) \quad S \vdash \forall x\, \neg(\varphi x \wedge \psi x). \]

Then, if T is a consistent formal theory, the set of (codes of) its theorems
can be shown to be r.e. and one defines

\[ W_i = \{n : T \vdash \varphi \bar{n}\}, \qquad W_j = \{n : T \vdash \neg\varphi \bar{n}\}. \]

By (*), $W_i \cap W_j = \emptyset$. Since T contains S and $\varphi$ numerates A in S, it follows
that $A \subseteq W_i$, $B \subseteq W_j$ and $n_0 = f(i, j) \notin W_i \cup W_j$. Thus $T \nvdash \varphi \bar{n}_0, \neg\varphi \bar{n}_0$.


*3.5. The formula hierarchy 


The purpose of the present subsection is mainly to establish some 
notation for several more advanced sections below. 

Recall that S is assumed to have, for each primitive recursive function f,
a formula $\varphi_f$ representing it in the strong sense of 3.2.5. $\varphi_f$ is called a
primitive recursive formula, or a PR formula.


3.5.1. DEFINITION. A formula $\varphi$ is $\Sigma_n$ ($\Pi_n$) iff for some PR formula $\psi$,
\[ \varphi = Q_1 x_1 \cdots Q_n x_n\, \psi, \]

where $Q_1 = \exists$ ($\forall$) and the quantifiers alternate in type. We write $\varphi \in \Sigma_n$
($\Pi_n$), ambiguously, as $\varphi$ is $\Sigma_n$ ($\Pi_n$) or provably equivalent (in S) to a $\Sigma_n$
($\Pi_n$) formula.
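
As a small illustration of the literal form required by the definition, the following Python sketch classifies a quantifier prefix. It is a sketch under assumptions: the prefix is written outermost-first as a string over 'E' (existential) and 'A' (universal), the matrix is taken to be a PR formula, and the name classify is hypothetical. It checks only the syntactic shape; the ambiguous usage "provably equivalent to a formula of that shape" is of course not decided by it.

```python
def classify(prefix):
    """Classify a quantifier prefix written outermost-first over 'E' and 'A'.

    Returns ("Sigma", n) or ("Pi", n) for a strictly alternating prefix of
    length n >= 1, ("PR", 0) for the empty prefix, and None when two
    adjacent quantifiers have the same type.
    """
    if any(q not in "EA" for q in prefix):
        raise ValueError("prefix must consist of 'E' and 'A' only")
    if not prefix:
        return ("PR", 0)
    if any(a == b for a, b in zip(prefix, prefix[1:])):
        return None  # not literally of the alternating form
    return ("Sigma" if prefix[0] == "E" else "Pi", len(prefix))
```

For example, classify("EAE") gives ("Sigma", 3) and classify("AE") gives ("Pi", 2), while classify("EE") gives None: a block of like quantifiers must first be contracted before the definition applies literally.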

Thus, one has the inclusions: 


\[ \Sigma_n \cup \Pi_n \subseteq \Sigma_{n+1} \cap \Pi_{n+1}. \]


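3.5.2. THEOREM. There is a $\Sigma_n$ truth definition for $\Sigma_n$ formulae. I.e., for each
n, there is a formula $\mathrm{Tr}_{\Sigma_n} \in \Sigma_n$ with only the numerical variable x free such
that, for $\varphi \in \Sigma_n$,

\[ S \vdash \varphi x \leftrightarrow \mathrm{Tr}_{\Sigma_n}(\ulcorner \varphi \dot{x} \urcorner). \]

A similar result holds for $\Pi_n$.
We omit the proof and note the following.


3.5.3. COROLLARY. The formula $\mathrm{Tr}_{\Sigma_n}(s(x, x))$ is $\Sigma_n$, non-$\Pi_n$.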


The proof is left as an easy exercise to the reader.

It follows that all the inclusions indicated above are proper. Thus we
have a genuine hierarchy.

From our point of view, there are two uses of this Hierarchy: First, since 
it is a hierarchy, we can use it to measure the complexity of formulae or of 
sets of formulae. Such use is made in Section 4. A second use is not of the 
Hierarchy itself taken as a hierarchy, but rather of Theorem 3.5.2: Many 
set-theoretic proofs could be carried out in arithmetic if one had a truth 
definition. Tarski’s Theorem asserts that there is no truth definition for the 
entire language [Exercise]. By Theorem 3.5.2, there are partial truth 
definitions $\mathrm{Tr}_1, \mathrm{Tr}_2, \ldots$ such that $\mathrm{Tr}_n$ works up to the n-th level of the
Hierarchy. This allows the formalization of certain outwardly set-theoretic 
constructions within arithmetic. An application is discussed in Section 6. 

Before proceeding, it is worth noting the following. 


3.5.4. FACT (demonstrable $\Sigma_1$-completeness). If $\varphi \in \Sigma_1$, then
\[ S \vdash \varphi x \to \mathrm{Pr}_T(\ulcorner \varphi \dot{x} \urcorner). \]


This follows from the discussion of 3.2.5. 


4. Metamathematical properties other than consistency 


Metamathematically, consistency is a minimal assumption on a theory.
One might wish for stronger properties to hold, e.g. $\omega$-consistency. If T is
a theory about a particular structure, as PA is a theory about the semiring
of natural numbers, one might wish for even more: soundness (anything
provable is true) or completeness (anything or its negation is provable,
hence anything true is provable).

In this section, we discuss these properties. In 4.1 we consider a
soundness scheme, the Reflection Principle. $\omega$-consistency is discussed in
4.2 and its relation to the more intuitive Reflection Principle is presented.
Completeness, which we know to be false, nonetheless gives rise to
consistent schemata. These are discussed in 4.3.


4.1. Reflection principles 


The First Incompleteness Theorem is proven by considering the sen- 
tence that asserts its own unprovability. Under minimal assumptions, it is 
clear that the sentence must be true — and hence unprovable. But what 
about the sentence that asserts its own provability ? Is it true? false? It was 
precisely this problem that led to the following important theorem 
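characterizing provable instances of the Reflection Principle:

4.1.1. LÖB'S THEOREM. Let $\varphi$ be closed. Then
\[ T \vdash \mathrm{Pr}_T(\ulcorner \varphi \urcorner) \to \varphi \quad \text{iff} \quad T \vdash \varphi. \]


PROOF. The one direction is obvious. For the other, assume that $T \nvdash \varphi$.
Then $T + \neg\varphi$ is consistent and we may appeal to the Second Incompleteness
Theorem to conclude that $T + \neg\varphi$ does not yield $\mathrm{Con}_{T + \neg\varphi}$, hence not
$\neg\mathrm{Pr}_T(\ulcorner \neg\varphi \to \Lambda \urcorner)$. Thus

\[ T + \neg\varphi \nvdash \neg\mathrm{Pr}_T(\ulcorner \varphi \urcorner). \]

Contraposition yields $T \nvdash \mathrm{Pr}_T(\ulcorner \varphi \urcorner) \to \varphi$. □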
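
Spelled out as a chain of implications, the nontrivial direction of the proof reads as follows. This is only a restatement of the argument just given, writing $\Lambda$ for the contradiction used in $\mathrm{Con}_T$; the justifications in the right-hand column are our annotations.

```latex
\begin{align*}
T \nvdash \varphi
 &\;\Longrightarrow\; T + \neg\varphi \text{ is consistent} \\
 &\;\Longrightarrow\; T + \neg\varphi \nvdash \mathrm{Con}_{T+\neg\varphi}
   && \text{(Second Incompleteness Theorem)} \\
 &\;\Longrightarrow\; T + \neg\varphi \nvdash \neg\mathrm{Pr}_T(\ulcorner \neg\varphi \to \Lambda \urcorner)
   && (\mathrm{Pr}_{T+\neg\varphi}(\ulcorner \Lambda \urcorner) \leftrightarrow \mathrm{Pr}_T(\ulcorner \neg\varphi \to \Lambda \urcorner)) \\
 &\;\Longrightarrow\; T + \neg\varphi \nvdash \neg\mathrm{Pr}_T(\ulcorner \varphi \urcorner)
   && (\neg\varphi \to \Lambda \text{ and } \varphi \text{ are provably equivalent}) \\
 &\;\Longrightarrow\; T \nvdash \neg\varphi \to \neg\mathrm{Pr}_T(\ulcorner \varphi \urcorner)
   && \text{(deduction theorem)} \\
 &\;\Longrightarrow\; T \nvdash \mathrm{Pr}_T(\ulcorner \varphi \urcorner) \to \varphi
   && \text{(contraposition)}
\end{align*}
```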


As hinted above, this solves the problem of sentences asserting their own 
provability — such sentences are provable (and hence equivalent — cf. also 
5.1). This also focuses our attention on the following schemata: 


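Local Reflection Principle

\[ \mathrm{Rfn}(T): \quad \mathrm{Pr}_T(\ulcorner \varphi \urcorner) \to \varphi, \quad \varphi \text{ closed.} \]

First Uniform Reflection Principle

\[ \mathrm{RFN}(T): \quad \forall x\, \mathrm{Pr}_T(\ulcorner \varphi \dot{x} \urcorner) \to \forall x\, \varphi x, \quad \varphi \text{ has only } x \text{ free.} \]
Second Uniform Reflection Principle

\[ \mathrm{RFN}'(T): \quad \forall x\, [\mathrm{Pr}_T(\ulcorner \varphi \dot{x} \urcorner) \to \varphi x], \quad \varphi \text{ has only } x \text{ free.} \]


[A stipulation must be inserted here: As indicated by the notation $\ulcorner \varphi \dot{x} \urcorner$,
the variable x must range over elements which can be named by constants.
Thus, we insist that the x in the uniform versions of the Reflection
Principle be a numerical variable. In fact, throughout the following, we
shall assume that all variables explicitly exhibited are numerical variables,
although non-numerical variables may occur unexhibited in the formulae.]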

The reflection principles are clearly schematic assertions of soundness — 
anything provable is true. As such, they immediately imply consistency 
and, thus, we see that they are underivable in T. Of course, Theorem 4.1.1 
tells us more than this: It characterizes the provable instances of Rfn(T). 
Nonetheless, it may be instructive to restate the First and Second Incom- 
pleteness Theorems in terms of the reflection principles: 


4.1.2. FIRST INCOMPLETENESS THEOREM. For some true, unprovable $\varphi$,

\[ T \nvdash \mathrm{Pr}_T(\ulcorner \varphi \urcorner) \to \varphi. \]

4.1.3. SECOND INCOMPLETENESS THEOREM. For any refutable $\varphi$,

\[ T \nvdash \mathrm{Pr}_T(\ulcorner \varphi \urcorner) \to \varphi. \]

Let us check that these statements are equivalent to the more familiar
versions. The First Incompleteness Theorem is no problem if we specify
that the true unprovable sentence we have in mind is the one asserting its
own unprovability. Then Theorem 4.1.1 simply yields

\[ T \nvdash \mathrm{Pr}_T(\ulcorner \varphi \urcorner) \to \varphi \quad \text{iff} \quad T \nvdash \varphi, \]

where the two equivalents are the two versions of the First Incompleteness
Theorem. For the Second Incompleteness Theorem, observe that, for
refutable $\varphi$, the following are equivalent over T:

\[ \mathrm{Pr}_T(\ulcorner \varphi \urcorner) \to \varphi, \qquad \neg\mathrm{Pr}_T(\ulcorner \varphi \urcorner), \qquad \neg\mathrm{Pr}_T(\ulcorner \Lambda \urcorner), \qquad \mathrm{Con}_T. \]


Theorem 4.1.1 again yields the equivalence of the two versions. 

Thus, Löb's Theorem is a generalization of the incompleteness
theorems. While this alone would justify taking a closer look at the 
reflection principles, it might be worth our while to mention another 
motivating factor. Recall that the main impetus behind Hilbert’s Consis- 
tency Program was the fact that consistency was equivalent to soundness 
for real statements: 


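4.1.4. THEOREM. Over S, the following are equivalent:
(i) $\mathrm{Con}_T$;
(ii) $\mathrm{Rfn}_{\Pi_1}(T)$;
(iii) $\mathrm{RFN}_{\Pi_1}(T)$;
(iv) $\mathrm{RFN}'_{\Pi_1}(T)$;
where the subscript "$\Pi_1$" indicates restriction of the schemata to $\varphi \in \Pi_1$.


PROOF. The implications (iv) → (iii) → (ii) are fairly direct. (ii) → (i) follows
from the above observation that $\mathrm{Con}_T \leftrightarrow (\mathrm{Pr}_T(\ulcorner \varphi \urcorner) \to \varphi)$ for any refutable $\varphi$:
one merely chooses such a $\varphi \in \Pi_1$.

(i) → (iv). Let $\varphi \in \Pi_1$ have only x free. Then $\neg\varphi x \in \Sigma_1$ and, by
demonstrable $\Sigma_1$-completeness (see 3.5),

\[ (*) \quad S \vdash \neg\varphi x \to \mathrm{Pr}_T(\ulcorner \neg\varphi \dot{x} \urcorner). \]
But
\[ (**) \quad S + \mathrm{Con}_T \vdash \mathrm{Pr}_T(\ulcorner \varphi \dot{x} \urcorner) \to \neg\mathrm{Pr}_T(\ulcorner \neg\varphi \dot{x} \urcorner), \]

whence (*) and (**) combine to give

\[ S + \mathrm{Con}_T \vdash \mathrm{Pr}_T(\ulcorner \varphi \dot{x} \urcorner) \to \varphi x. \quad □ \]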




The interested reader is referred to 5.2 for some applications of Theorem 
4.1.4. For the moment, we simply use it as our second reason to justify our 
interest in reflection principles: Consistency is equivalent to a restricted 
Reflection Principle. 

Having decided that reflection principles are worth studying, we may as 
well begin. First, let us observe that the schemata are listed in full 
generality. For one thing, we must restrict ourselves to schemata since the 
sentence 


\[ \forall x\, [\mathrm{Pr}_T(\ulcorner \varphi \dot{x} \urcorner) \to \mathrm{Tr}(\ulcorner \varphi \dot{x} \urcorner)], \]

where $\mathrm{Tr}(\ulcorner \psi \urcorner)$ asserts "$\psi$ is true", cannot be asserted in T. For, by Tarski's
Theorem on Truth Definitions, there can be no truth definition for T within
T itself (cf. 3.5). Further, extra variables in either version of the uniform
scheme can be contracted by means of a pairing function, reducing the
general scheme to the two listed. A hybrid, e.g.
$\forall x\, [\forall y\, \mathrm{Pr}_T(\ulcorner \varphi \dot{x} \dot{y} \urcorner) \to \forall y\, \varphi x y]$, is clearly implied by the several variable
Second Uniform Reflection Principle. Finally, we have the following 
theorem. 


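4.1.5. THEOREM (Feferman). RFN(T) and RFN'(T) are equivalent over S.


PROOF. Obviously, the instance, $\forall x\, \mathrm{Pr}_T(\ulcorner \varphi \dot{x} \urcorner) \to \forall x\, \varphi x$, of RFN(T) is
implied by the corresponding instance of RFN'(T). The converse
requires a minor (but often useful) lemma:


4.1.6. LEMMA. $S \vdash \mathrm{Pr}_T(\ulcorner \mathrm{Prov}_T(\dot{y}, \ulcorner \varphi \dot{x} \urcorner) \to \varphi \dot{x} \urcorner)$.


PROOF. (a) By D1 and D3,
\[
\begin{aligned}
S \vdash \mathrm{Prov}_T(y, \ulcorner \varphi \dot{x} \urcorner) &\to \mathrm{Pr}_T(\ulcorner \varphi \dot{x} \urcorner) \\
&\to \mathrm{Pr}_T(\ulcorner \mathrm{Prov}_T(\dot{y}, \ulcorner \varphi \dot{x} \urcorner) \to \varphi \dot{x} \urcorner).
\end{aligned}
\]
(b) Since $\neg\mathrm{Prov}_T \in \mathrm{PR}$, we similarly have
\[
\begin{aligned}
S \vdash \neg\mathrm{Prov}_T(y, \ulcorner \varphi \dot{x} \urcorner) &\to \mathrm{Pr}_T(\ulcorner \neg\mathrm{Prov}_T(\dot{y}, \ulcorner \varphi \dot{x} \urcorner) \urcorner) \\
&\to \mathrm{Pr}_T(\ulcorner \mathrm{Prov}_T(\dot{y}, \ulcorner \varphi \dot{x} \urcorner) \to \varphi \dot{x} \urcorner).
\end{aligned}
\]

Combining (a) and (b) yields the lemma. □


To complete the proof of Theorem 4.1.5, let $\varphi$ be given and observe

\[ S \vdash \forall x\, [\mathrm{Pr}_T(\ulcorner \varphi \dot{x} \urcorner) \to \varphi x] \leftrightarrow \forall x y\, [\mathrm{Prov}_T(y, \ulcorner \varphi \dot{x} \urcorner) \to \varphi x]. \]
But the right-hand side of this equivalence is derivable in S + RFN(T) by
Lemma 4.1.6. □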


Before proceeding further, it is amusing to note the following provable 
instance of reflection: 


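\[ (*) \quad S \vdash \exists y\, \mathrm{Pr}_T(\ulcorner \mathrm{Prov}_T(\dot{y}, \ulcorner \varphi \urcorner) \urcorner) \to \exists y\, \mathrm{Prov}_T(y, \ulcorner \varphi \urcorner), \]

i.e. $S \vdash \exists y\, \mathrm{Pr}_T(\ulcorner \mathrm{Prov}_T(\dot{y}, \ulcorner \varphi \urcorner) \urcorner) \to \mathrm{Pr}_T(\ulcorner \varphi \urcorner)$. (*) follows immediately from
Lemma 4.1.6 and condition D3 and we leave its derivation to the reader.
By Theorem 4.1.4 and the Second Incompleteness Theorem, the variable y
on the right side of (*) cannot in general denote the same code for a
derivation as the y on the left. We can also see this by appeal to the
following free-variable form of Löb's Theorem:


4.1.7. THEOREM. Let $\varphi$ have only x free. Then
\[ T \vdash \forall x\, [\mathrm{Pr}_T(\ulcorner \varphi \dot{x} \urcorner) \to \varphi x] \quad \text{iff} \quad T \vdash \forall x\, \varphi x. \]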


We omit the proof. 

So far, we have shown that the use of reflection principles allows a
generalization of the incompleteness theorems, that $\mathrm{Con}_T$ is equivalent to a
restriction of the Reflection schemata, and that RFN(T) is as general as
RFN'(T) and further schemata with additional variables. We ought to ask
ourselves the simple question: How much of an improvement is Reflection
over Consistency? Obviously, consistency does not imply soundness: e.g.
$T = PA + \neg\mathrm{Con}_{PA}$ is consistent but not sound as $\neg\mathrm{Con}_{PA}$ is a false
theorem of T. (For this same T, however, $T + \mathrm{Con}_T \vdash \mathrm{RFN}(T)$, for
$T \vdash \neg\mathrm{Con}_T$.) A first step is given by the following simple lemma:


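4.1.8. LEMMA. Let $\varphi$ be closed. Then
(i) $T + \varphi + \mathrm{Rfn}(T) \vdash \mathrm{Rfn}(T + \varphi)$,
(ii) $T + \varphi + \mathrm{RFN}(T) \vdash \mathrm{RFN}(T + \varphi)$.


PROOF. Observe that for any $\psi$, we have the following over S:
\[ S \vdash \mathrm{Pr}_{T + \varphi}(\ulcorner \psi \urcorner) \leftrightarrow \mathrm{Pr}_T(\ulcorner \varphi \to \psi \urcorner). \quad □ \]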
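
To see why this observation suffices for part (i), fix a closed $\psi$ and argue inside $T + \varphi + \mathrm{Rfn}(T)$. The following chain is our own unpacking of the step, not part of the original proof; part (ii) is handled in the same way with the numerical variable carried along.

```latex
\begin{align*}
\mathrm{Pr}_{T+\varphi}(\ulcorner \psi \urcorner)
 &\to \mathrm{Pr}_T(\ulcorner \varphi \to \psi \urcorner)
   && \text{(the observation above)} \\
 &\to (\varphi \to \psi)
   && \text{(instance of } \mathrm{Rfn}(T) \text{ for the closed sentence } \varphi \to \psi\text{)} \\
 &\to \psi
   && (\varphi \text{ is an axiom of } T + \varphi)
\end{align*}
```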
4.1.9. COROLLARY. Let $\varphi$ be closed.
(i) Let $T + \varphi$ be consistent. Then $T + \varphi \nvdash \mathrm{Rfn}(T)$.
(ii) Let T' be a consistent finite extension of T. Then $T' \nvdash \mathrm{Rfn}(T)$.
(iii) If $T + \mathrm{Con}_T$ is consistent, then $T + \mathrm{Con}_T \nvdash \mathrm{Rfn}(T)$.


*4.1°. Hierarchy considerations


By Corollary 4.1.9, neither Rfn(T) nor RFN(T) is implied by any finite 
set of axioms consistent with T. We devote the rest of this subsection to 
discussing an often useful improvement of this in the uniform case. For this 
purpose, we must use the notions and notations of the Formula Hierarchy 
(discussed in 3.5). This material is less detailed and may be omitted on first 
reading. 

Let $\mathrm{RFN}_{\Pi_k}(T)$ denote the restriction of the scheme RFN(T) to formulae
in $\Pi_k$. Similarly, one defines $\mathrm{RFN}_{\Sigma_k}(T)$, $\mathrm{RFN}'_{\Pi_k}(T)$, and $\mathrm{RFN}'_{\Sigma_k}(T)$. A first
result is: 


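4.1.10. THEOREM. Over S, the following are equivalent ($k \ge 0$):
(i) $\mathrm{RFN}_{\Sigma_k}(T)$,
(ii) $\mathrm{RFN}_{\Pi_{k+1}}(T)$,
(i.a) $\mathrm{RFN}'_{\Sigma_k}(T)$,
(ii.a) $\mathrm{RFN}'_{\Pi_{k+1}}(T)$.


The equivalences (x) ↔ (x.a) follow by taking a closer look at the proof of
Theorem 4.1.5. The implications (ii) → (i), (ii.a) → (i.a) are trivial as $\Sigma_k \subseteq \Pi_{k+1}$.
For the converses, one uses provable closure under numerical
substitution: $S \vdash \mathrm{Pr}_T(\ulcorner \forall x\, \varphi x \urcorner) \to \forall x\, \mathrm{Pr}_T(\ulcorner \varphi \dot{x} \urcorner)$.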


In terms of the Hierarchy, Lemma 4.1.8 can be restated: 


4.1.11. THEOREM. Let $\varphi \in \Pi_k$ be closed and let $n \ge k$. Then
(i) $S + \varphi + \mathrm{Rfn}_{\Sigma_n}(T) \vdash \mathrm{Rfn}_{\Sigma_n}(T + \varphi)$,
(ii) $S + \varphi + \mathrm{RFN}_{\Sigma_n}(T) \vdash \mathrm{RFN}_{\Sigma_n}(T + \varphi)$.


This is seen by observing that, if $\psi \in \Sigma_n$, then $\varphi \to \psi \in \Sigma_n$.

Using a $\Pi_k$ truth definition for $\Pi_k$ formulae, it can be shown that
$\mathrm{RFN}_{\Pi_k}(T)$ can be written as a single $\Pi_k$ sentence. Bearing this in mind, we
have:


4.1.12. COROLLARY. (i) $T + \mathrm{RFN}_{\Sigma_{k+1}}(T) \vdash \mathrm{RFN}_{\Sigma_{k+1}}(T + \mathrm{RFN}_{\Pi_{k+1}}(T))$, $k \ge 0$.
(ii) $T + \mathrm{RFN}_{\Pi_{k+1}}(T) \vdash \mathrm{RFN}_{\Pi_{k+1}}(T + \mathrm{RFN}_{\Pi_k}(T))$, $k \ge 1$.
(iii) If $T + \mathrm{RFN}_{\Pi_k}(T)$ is consistent, then

\[ T + \mathrm{RFN}_{\Pi_k}(T) \nvdash \mathrm{Rfn}_{\Sigma_k}(T), \quad k \ge 1. \]

(iii') If $T + \mathrm{Con}_T$ is consistent, then
\[ T + \mathrm{Con}_T \nvdash \mathrm{Rfn}_{\Sigma_1}(T). \]


PROOF. Parts (i) and (ii) follow from Theorems 4.1.10 and 4.1.11. Parts (iii)
and (iii') are simple applications of the incompleteness theorems in the
forms of Theorems 4.1.2 and 4.1.3. □


By Corollary 4.1.12, RFN is not implied by any (consistent) bounded set 
of its instances. The following theorem of Kreisel and Levy improves this 
immensely: 


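4.1.13. ESSENTIAL UNBOUNDEDNESS THEOREM. Let n be given. Let U be an
r.e. theory (not necessarily containing S) in the language of T. If
$T \vdash \mathrm{RFN}(U)$, then no consistent extension of U by a set A of $\Sigma_n$ sentences
implies all theorems of T. In particular, T cannot be axiomatized over U by
any set of $\Sigma_n$ axioms.


PROOF. Let $\mathrm{Tr}_n$ be a $\Sigma_n$ truth definition for $\Sigma_n$ formulae. Define $\psi$ by
\[ S \vdash \psi \leftrightarrow \forall x\, [\mathrm{Tr}_n(x) \to \neg\mathrm{Pr}_U(\mathrm{imp}(x, \ulcorner \psi \urcorner))]. \]

Intuitively, $\psi$ asserts its unprovability from any true $\Sigma_n$ sentence.
(a) Since $T \vdash \mathrm{RFN}(U)$, it follows that

\[ T \vdash \forall x\, [\mathrm{Pr}_U(\mathrm{imp}(x, \ulcorner \psi \urcorner)) \to (\mathrm{Tr}_n(x) \to \psi)]. \]
Thus
\[
\begin{aligned}
T \vdash \neg\psi &\to \forall x\, [\neg\mathrm{Tr}_n(x) \vee \neg\mathrm{Pr}_U(\mathrm{imp}(x, \ulcorner \psi \urcorner))] \\
&\to \forall x\, [\mathrm{Tr}_n(x) \to \neg\mathrm{Pr}_U(\mathrm{imp}(x, \ulcorner \psi \urcorner))] \to \psi.
\end{aligned}
\]
Thus $T \vdash \psi$.
(b) Suppose U + A extends T; $A \subseteq \Sigma_n$. Then $U + A \vdash \psi$. We now show
that this implies the inconsistency of U + A: Since $U + A \vdash \psi$, it follows that
$U + X \vdash \psi$ for some finite $X \subseteq A$. Let $\varphi = \bigwedge X$. Then $U \vdash \varphi \to \psi$. But this
implies

(i) $T \vdash \mathrm{Pr}_U(\ulcorner \varphi \to \psi \urcorner)$.
Since $T \vdash \psi$,
(ii) $T \vdash \mathrm{Pr}_U(\ulcorner \varphi \to \psi \urcorner) \to \neg\mathrm{Tr}_n(\ulcorner \varphi \urcorner)$.

But $\varphi \in \Sigma_n$ and
(iii) $T \vdash \varphi \leftrightarrow \mathrm{Tr}_n(\ulcorner \varphi \urcorner)$,
and (i)-(iii) yield $T \vdash \neg\varphi$. □

The Essential Unboundedness Theorem is a useful tool for proving
unboundedness theorems, i.e. results of the form: Axiomatizing $T_1$ over $T_2$
requires arbitrarily complicated formulae. The most basic example is that
in which $T_2 = PC$: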


4.1.14. DEFINITION. T is said to be reflexive if $T \vdash \mathrm{RFN}(PC)$, where PC
denotes the predicate calculus (as formulated in the language of T).


We remark that, by Lemma 4.1.8, if T is reflexive, $T \vdash \mathrm{RFN}(U)$ for all
finite subsystems U of T. In particular, $T \vdash \mathrm{Con}_U$ for such U.


4.1.15. REFLEXIVENESS THEOREM. PA and ZF are reflexive.


The usual proof of this theorem is too long to be given here. For ZF, we 
can give the following simple proof: By formalized induction on the length 
of a derivation, 


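\[ ZF \vdash \forall x\, [\mathrm{Pr}_{PC}(\ulcorner \varphi \dot{x} \urcorner) \to \forall a\, (\mathrm{Trans}(a) \wedge x \in a \to \varphi^a x)], \]

where $\varphi^a$ denotes the relativization of $\varphi$ to a and Trans(a) asserts that a
is transitive. By the set-theoretic reflection principle,

\[ ZF \vdash \forall x\, [\neg\varphi x \to \exists a\, [\mathrm{Trans}(a) \wedge x \in a \wedge \neg\varphi^a x]], \]
whence
\[ ZF \vdash \forall x\, [\mathrm{Pr}_{PC}(\ulcorner \varphi \dot{x} \urcorner) \to \varphi x]. \]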


As corollaries, we see that (i) the induction scheme of PA is not implied 
by any bounded set of its instances, and (ii) one cannot bound all the 
schemata of ZF. 


*4.2. $\omega$-consistency


The concept of $\omega$-consistency was introduced by Gödel for the purpose
of stating the hypotheses needed for the First Incompleteness Theorem.
The $\omega$-consistency of T is neither the optimal nor the most intuitive
condition sufficient for the theorem. Nonetheless, its use here is so firmly
entrenched in the literature that we are obligated to comment on it.

Informally, $\omega$-consistency is the property that holds of T if the following
two conditions are not simultaneously satisfied for any $\varphi$:

(i) $T \vdash \exists x\, \varphi x$;

(ii) $T \vdash \neg\varphi \bar{0}, \neg\varphi \bar{1}, \ldots$.

Formally, $\omega$-consistency can be represented (in varying degrees of generality)
by (modifications of) the following scheme:

4.2.1.
\[ \mathrm{Pr}_T(\ulcorner \exists x\, \varphi x \urcorner) \to \exists x\, \neg\mathrm{Pr}_T(\ulcorner \neg\varphi \dot{x} \urcorner). \]


Let $1\text{-}\mathrm{Con}_T$ denote the restriction of 4.2.1 to $\varphi \in \mathrm{PR}$ possessing only one
free variable.


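4.2.2. FORMALIZED FIRST INCOMPLETENESS THEOREM. Let $\varphi$ be $\neg\mathrm{Pr}_T(\ulcorner \varphi \urcorner)$.
Then:

(i) $T + \mathrm{Con}_T \vdash \neg\mathrm{Pr}_T(\ulcorner \varphi \urcorner)$,

(ii) $T + 1\text{-}\mathrm{Con}_T \vdash \neg\mathrm{Pr}_T(\ulcorner \neg\varphi \urcorner)$.


PROOF. Part (i) was shown in the course of the proof of the Second
Incompleteness Theorem.
For part (ii), let $\varphi = \forall x\, \psi x$, $\psi \in \mathrm{PR}$. Then

\[ T + 1\text{-}\mathrm{Con}_T \vdash \mathrm{Pr}_T(\ulcorner \neg\varphi \urcorner) \to \exists x\, \neg\mathrm{Pr}_T(\ulcorner \psi \dot{x} \urcorner) \to \exists x\, \neg\psi x, \]

by demonstrable $\Sigma_1$-completeness (3.5.4). Thus,

\[
\begin{aligned}
(*) \quad T + 1\text{-}\mathrm{Con}_T \vdash \mathrm{Pr}_T(\ulcorner \neg\varphi \urcorner) &\to \neg\varphi \\
(**) \qquad\qquad &\to \mathrm{Pr}_T(\ulcorner \varphi \urcorner).
\end{aligned}
\]
But $1\text{-}\mathrm{Con}_T$ yields $\mathrm{Con}_T$ since it asserts the unprovability of something.
Thus, by (i),

\[ T + 1\text{-}\mathrm{Con}_T \vdash \neg\mathrm{Pr}_T(\ulcorner \varphi \urcorner), \]

which, with (**), yields (ii). □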

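Probably the most important thing to notice about the above proof is
that $1\text{-}\mathrm{Con}_T$ was used only to derive (*): $\mathrm{Pr}_T(\ulcorner \neg\varphi \urcorner) \to \neg\varphi$, for closed
$\varphi \in \Pi_1$; i.e., $1\text{-}\mathrm{Con}_T$ was used only to derive $\mathrm{Rfn}_{\Sigma_1}(T)$. Conversely,
$\mathrm{Rfn}_{\Sigma_1}(T)$ can be used to derive $1\text{-}\mathrm{Con}_T$:

\[
\begin{aligned}
S + \mathrm{Rfn}_{\Sigma_1}(T) \vdash \mathrm{Pr}_T(\ulcorner \exists x\, \varphi x \urcorner) &\to \exists x\, \varphi x \\
&\to \exists x\, \neg\mathrm{Pr}_T(\ulcorner \neg\varphi \dot{x} \urcorner),
\end{aligned}
\]

since $\neg\varphi x \leftrightarrow \mathrm{Pr}_T(\ulcorner \neg\varphi \dot{x} \urcorner)$ by demonstrable PR-completeness and the fact
that $\mathrm{Rfn}_{\Sigma_1}(T)$ implies $\mathrm{Con}_T$.

By the preceding paragraph, $\mathrm{Rfn}_{\Sigma_1}(T)$ and $1\text{-}\mathrm{Con}_T$ are equivalent. Since
the statement 4.2.1 of $\omega$-consistency is easily seen to be $\Sigma_2$, Corollary 4.1.12
shows that we cannot expect such behavior to hold for more than some
very special cases.

4.2.3. DEFINITION. We define the following formal schemata representing
$\omega$-consistency:


Local $\omega$-consistency

\[ \omega\text{-}\mathrm{Con}_T: \quad \mathrm{Pr}_T(\ulcorner \exists x\, \varphi x \urcorner) \to \exists x\, \neg\mathrm{Pr}_T(\ulcorner \neg\varphi \dot{x} \urcorner), \quad \varphi \text{ has only } x \text{ free,} \]
Uniform $\omega$-consistency

\[ \omega\text{-}\mathrm{CON}_T: \quad \forall y\, [\mathrm{Pr}_T(\ulcorner \exists x\, \varphi x \dot{y} \urcorner) \to \exists x\, \neg\mathrm{Pr}_T(\ulcorner \neg\varphi \dot{x} \dot{y} \urcorner)], \quad \varphi \text{ has only } x, y \text{ free,} \]
Global $\omega$-consistency

\[ \omega\text{-}\mathrm{CON}^g_T: \quad \forall \varphi\, [\mathrm{Pr}_T(\ulcorner \exists x\, \varphi x \urcorner) \to \exists x\, \neg\mathrm{Pr}_T(\ulcorner \neg\varphi \dot{x} \urcorner)], \]

where $\forall \varphi$ indicates quantification over codes of formulae possessing only
one free variable.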

We hasten to emphasize the fact that, unlike the case with reflection 
principles, we have here a global representation of the given concept as 
well as the local and uniform ones. The reason is simply that we only use $\varphi$
in 4.2.1 in the form of a code and not, as with reflection, as a subformula of
some larger formula. Thus, we can quantify over all $\varphi$ in the present
context.

It is not hard to see that $\omega\text{-}\mathrm{CON}^g_T \vdash \omega\text{-}\mathrm{CON}_T \vdash \omega\text{-}\mathrm{Con}_T$ over S. Local
schemata are usually difficult to deal with and so we ignore $\omega\text{-}\mathrm{Con}_T$. Thus,
we are interested in $\omega\text{-}\mathrm{CON}^g_T$ and $\omega\text{-}\mathrm{CON}_T$, and in their hierarchical
restrictions:


4.2.4. DEFINITION. Let $k \ge 1$. The restriction of $\omega\text{-}\mathrm{CON}_T$ to formulae
$\varphi \in \Sigma_{k-1}$ is termed $k\text{-}\mathrm{CON}_T$ and the corresponding informal concept is
called k-consistency.


Observe that, via a $\Sigma_k$ truth definition for $\Sigma_k$ formulae, the corresponding
restriction of $\omega\text{-}\mathrm{CON}^g_T$ ($\forall \varphi \mapsto \forall \varphi \in \Sigma_k$) is equivalent to $(k+1)\text{-}\mathrm{CON}_T$.
Further, as in Theorem 4.1.10, $k\text{-}\mathrm{CON}_T$ is equivalent to the corresponding
restriction for $\varphi \in \Pi_k$.

The following theorem characterizes these notions in terms of the more 
intuitive reflection principles: 


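4.2.5. THEOREM. Over S, we have
i. $k\text{-}\mathrm{CON}_T \leftrightarrow \mathrm{RFN}_{\Pi_{k+1}}(T)$, (k = 1, 2)
ii. $k\text{-}\mathrm{CON}_T \leftrightarrow \mathrm{RFN}_{\Pi_3}(T + \mathrm{RFN}_{\Pi_k}(T))$, ($k \ge 2$)
iii. $\omega\text{-}\mathrm{CON}^g_T \leftrightarrow \mathrm{RFN}_{\Pi_3}(T + \mathrm{RFN}(T))$,
iii'. $\omega\text{-}\mathrm{CON}^g_T \leftrightarrow \mathrm{RFN}_{\Pi_3}(T + \omega\text{-}\mathrm{CON}_T)$.


The proof is rather long and we omit it. Some related results are covered
in Chapter D.2.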

[As an aside, we would like to mention the following: The formula
hierarchy can, as an obvious use, be applied to obtain quantitative
refinements of various results. See e.g. 4.1°. Theorem 4.2.5 gives an
application of a different order: The explication of $\omega$-consistency as a
Reflection Principle (iii) presupposes an understanding of the expression
"$\Pi_3$". I.e. one must know something about the formula hierarchy even to
state the relation between $\omega$-consistency and soundness.]


4.3. Completeness properties 


Somewhat loosely, the First Incompleteness Theorem asserts that consis- 
tent strong formal theories are incomplete. Nonetheless, there are consis- 
tent schemata asserting completeness. We (need) consider only the local 
versions: 

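Syntactic completeness

\[ \mathrm{SynComp}_T: \quad \mathrm{Pr}_T(\ulcorner \varphi \urcorner) \vee \mathrm{Pr}_T(\ulcorner \neg\varphi \urcorner), \quad \text{closed } \varphi. \]
Semantic completeness

\[ \mathrm{SemComp}_T: \quad \varphi \to \mathrm{Pr}_T(\ulcorner \varphi \urcorner), \quad \text{closed } \varphi. \]
$\omega$-completeness

\[ \omega\text{-}\mathrm{Comp}_T: \quad \forall x\, \mathrm{Pr}_T(\ulcorner \varphi \dot{x} \urcorner) \to \mathrm{Pr}_T(\ulcorner \forall x\, \varphi x \urcorner), \quad \varphi \text{ has only } x \text{ free.} \]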


Without further ado, we state: 


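4.3.1. THEOREM. The following are equivalent over S:
(i) $\neg\mathrm{Con}_T$;
(ii) $\mathrm{SynComp}_T$;
(iii) $\mathrm{SemComp}_T$;
(iv) $\omega\text{-}\mathrm{Comp}_T$.


PROOF. Obviously (i) → (ii), (iii), (iv).
(ii) → (i) We appeal to a Formalized Rosser's Theorem: If $\varphi$ is
$\neg\mathrm{Pr}^R_T(\ulcorner \varphi \urcorner)$, then

\[ S + \mathrm{Con}_T \vdash \neg\mathrm{Pr}_T(\ulcorner \varphi \urcorner), \neg\mathrm{Pr}_T(\ulcorner \neg\varphi \urcorner), \]
whence
\[ S + \mathrm{SynComp}_T \vdash \neg\mathrm{Con}_T. \]

(iii) → (i) Here, one takes $\varphi = \mathrm{Con}_T$ and applies the Second Incompleteness
Theorem. We omit the details in favor of the following case:
(iv) → (i) Let $\varphi x = \neg\mathrm{Prov}_T(x, \ulcorner \Lambda \urcorner)$. By Lemma 4.1.6,

\[ S \vdash \forall x\, \mathrm{Pr}_T(\ulcorner \neg\mathrm{Prov}_T(\dot{x}, \ulcorner \Lambda \urcorner) \urcorner), \]
whence
\[
\begin{aligned}
S + \omega\text{-}\mathrm{Comp}_T &\vdash \mathrm{Pr}_T(\ulcorner \forall x\, \neg\mathrm{Prov}_T(x, \ulcorner \Lambda \urcorner) \urcorner) \\
&\vdash \mathrm{Pr}_T(\ulcorner \mathrm{Pr}_T(\ulcorner \Lambda \urcorner) \to \Lambda \urcorner) \\
&\vdash \mathrm{Pr}_T(\ulcorner \Lambda \urcorner),
\end{aligned}
\]
by the Formalized Löb's Theorem: $S \vdash \mathrm{Pr}_T(\ulcorner \mathrm{Pr}_T(\ulcorner \varphi \urcorner) \to \varphi \urcorner) \to \mathrm{Pr}_T(\ulcorner \varphi \urcorner)$. □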


*4.3°. Kent's Theorem


By Theorem 4.3.1 (iii), the scheme $\varphi \to \mathrm{Pr}_T(\ulcorner \varphi \urcorner)$ is equivalent to $\neg\mathrm{Con}_T$
and, hence, is not in general derivable, not even when restricted to
$\varphi \in \Pi_1$ (namely $\varphi = \mathrm{Con}_T$). We also know, from 3.5.4, that the subscheme
$\varphi x \to \mathrm{Pr}_T(\ulcorner \varphi \dot{x} \urcorner)$, $\varphi \in \Sigma_1$, is derivable. The following Theorem of Kent
shows that the situation is even more complicated yet:


4.3.2. THEOREM. For any n, there is a sentence g such that 
(i) Ste > Prs('¢'); 
(ii) For no EX, does Stgpow. 


PROOF. First, let χ be such that for no ψ ∈ Σ_n consistent with S do we have
(*) S ⊢ ψ → χ or S ⊢ ψ → ¬χ.


To construct such a χ, we take a hint from the Essential Unboundedness
Theorem and let


χ ↔ ∀x [Tr_n(x) → ¬Pr_S(imp(x, ⌜χ⌝))],


where Tr_n is a Σ_n truth definition for Σ_n formulae. Mimicking the proof of
Rosser's Theorem, we see that (*) fails for all ψ ∈ Σ_n consistent with S.

Now let φ be χ ∧ Pr_S(⌜Λ⌝). Clearly (i) holds. To see (ii), suppose
S ⊢ φ ↔ ψ for ψ ∈ Σ_n. Then S ⊢ ψ → χ, so S ⊢ ¬ψ, since otherwise (*) is
true. Hence S ⊢ ¬φ and, since ¬φ is ¬(χ ∧ Pr_S(⌜Λ⌝)), S ⊢ Pr_S(⌜Λ⌝) → ¬χ,
contradicting ¬(*) since S + Pr_S(⌜Λ⌝) is consistent. □
5. Two applications

5.1. A fix-point theorem

By diagonalization, one easily finds sentences ψ, χ such that
S ⊢ ψ ↔ ¬Pr_T(⌜ψ⌝), S ⊢ χ ↔ Pr_T(⌜χ⌝).


The proof of the Second Incompleteness Theorem and Löb's Theorem,
respectively, yielded the interesting facts that ψ and χ were not only
unique, but explicitly definable:


S ⊢ ψ ↔ Con_T ↔ ¬Pr_T(⌜Λ⌝),
S ⊢ χ ↔ t,
where t = truth. An older proof of Löb's Theorem uses the fix-point,
θ ↔ (Pr_T(⌜θ⌝) → φ),
for any given φ. A little algebra soon reveals
θ ↔ (Pr_T(⌜φ⌝) → φ),


whence θ too is explicitly definable, this time from the remaining
variable. These turn out not to be isolated examples, but rather instances of
a general result.

To obtain a simple statement of this result, we consider a propositional
language with propositional variables p, q, r, ...; the usual connectives ∧,
∨, ¬, →; propositional constants t, f for truth and falsity; and a modal
operator □ to stand for provability. This will be an interpreted system
rather than a deductive one: Given an assignment, p ↦ φ_p, of sentences to
propositional variables, we obtain a translation φ_α for each formula α of
the propositional language:


Pap Pa ° Pg, So= AVI; 
φ_{¬α} = ¬φ_α; φ_{□α} = Pr_T(⌜φ_α⌝).


If α(p₁, ..., p_n) is a formula of the propositional language and we assign ψ_i
to p_i, then we write α(ψ₁, ..., ψ_n) for φ_{α(p₁, ..., p_n)}, i.e. for the result of
substituting each ψ_i for the corresponding p_i and Pr_T for □.


5.1.1. THEOREM (De Jongh's Fix-Point Theorem¹). Let α(p, q) be such that
p occurs only inside the scope of □. Then, for some β(q) and all sentences
ψ₁, ..., ψ_n of the language of T,

S ⊢ β(ψ) ↔ α(β(ψ), ψ).
Further, β(ψ) is, up to provable equivalence, the only fix-point of α.

¹ A proof of Theorem 5.1.1 has now been published in SAMBIN [0000].


The proof of this is too complicated to be included here. However, we 
can prove the following special case: 


5.1.2. THEOREM. Let α(p, q) = α'(□γ(p, q), q), where in α'(x, y) the variable
x does not occur inside the scope of a □. Then the fix-points of α are
determined parametrically by


β(q) = α'[□γ(α'(t, q), q), q].


PROOF. Although we are interested in establishing the result with sentences
ψ replacing the variables q, the propositional notation is more
convenient. An expression ⊢ δ(q) ↔ δ'(q) is understood accordingly.

Since the Diagonalization Lemma holds, we may assume that we have a
p such that ⊢ p ↔ α(p, q). It will follow from this that ⊢ p ↔ β(q).


5.1.3. LEMMA. For all r, t, δ: r ↔ t, □(r ↔ t) ⊢ δ(r) ↔ δ(t).


This follows from the derivability conditions by induction on the length
of δ. We omit the proof.
To prove Theorem 5.1.2, we first show:


(*) ⊢ □γ(p, q) ↔ □γ(α'(t, q), q).


By the fix-point assumption, ⊢ p ↔ α'(□γ(p, q), q). Thus, since □γ(p, q)
does not occur inside the scope of □ in α',


(**) □γ(p, q) ⊢ □γ(p, q) ↔ t
              ⊢ α'(□γ(p, q), q) ↔ α'(t, q) ⊢ p ↔ α'(t, q).
The derivability conditions yield
(***) □γ(p, q) ⊢ □□γ(p, q) ⊢ □(p ↔ α'(t, q)).
Applying the lemma to (**), (***), we immediately have
□γ(p, q) ⊢ □γ(α'(t, q), q),


i.e. half of (*).
For the converse, assume □γ(p, q) ∧ ¬γ(p, q). Then


□γ(p, q) ∧ ¬γ(p, q) ⊢ ¬γ(α'(t, q), q),
by the same reasoning as above. Contraposition yields


γ(α'(t, q), q) ⊢ □γ(p, q) → γ(p, q),
whence


□γ(α'(t, q), q) ⊢ □(□γ(p, q) → γ(p, q)) ⊢ □γ(p, q),
by the Formalized Löb's Theorem,
S ⊢ Pr_T(⌜Pr_T(⌜φ⌝) → φ⌝) → Pr_T(⌜φ⌝).


This completes the proof of (*).
To conclude the proof, observe


β(q) = α'[□γ(α'(t, q), q), q] ↔ α'(□γ(p, q), q) by (*)
     ↔ p,


the latter by choice of p. □


5.1.4. EXAMPLE. We list: (i) α(p, q), (ii) α'(p, q), (iii) γ(p, q), (iv)
□γ(α'(t, q), q), (v) β(q), and (vi) a final simplification of (v):


        (a)       (b)     (c)              (d)

(i)     ¬□p       □p      □p → q           □(p → q)
(ii)    ¬p        p       p → q            p
(iii)   p         p       p                p → q
(iv)    □¬t       □t      □(t → q)         □(t → q)
(v)     ¬□¬t      □t      □(t → q) → q     □(t → q)
(vi)    ¬□f       t       □q → q           □q
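
Rows (iv) and (v) of the example are pure substitution instances of the formula β(q) = α'[□γ(α'(t, q), q), q] of Theorem 5.1.2, so they can be checked mechanically. The following is a minimal Python sketch under stated assumptions: the tuple encoding of formulas, the placeholder variable x for the first argument of α', and the helper names subst, show and beta are all introduced here for illustration and do not come from the text. No simplification (row (vi)) is attempted.

```python
# Formulas as nested tuples (an encoding assumed for this sketch):
#   ("var", name), ("const", "t"), ("not", A), ("imp", A, B), ("box", A)
T = ("const", "t")
P, Q, X = ("var", "p"), ("var", "q"), ("var", "x")

def subst(formula, name, replacement):
    """Replace every occurrence of the variable `name` by `replacement`."""
    if formula[0] == "var":
        return replacement if formula[1] == name else formula
    if formula[0] == "const":
        return formula
    return (formula[0],) + tuple(subst(part, name, replacement) for part in formula[1:])

def show(formula):
    """Render a formula, parenthesizing implications that occur as subformulas."""
    kind = formula[0]
    if kind in ("var", "const"):
        return formula[1]
    if kind == "not":
        return "¬" + wrap(formula[1])
    if kind == "box":
        return "□" + wrap(formula[1])
    return wrap(formula[1]) + " → " + wrap(formula[2])

def wrap(formula):
    text = show(formula)
    return "(" + text + ")" if formula[0] == "imp" else text

def beta(alpha_prime, gamma):
    """Build α'[□γ(α'(t, q), q), q]; α' is written in x, γ in p (sketch conventions)."""
    inner = subst(alpha_prime, "x", T)            # α'(t, q)
    boxed = ("box", subst(gamma, "p", inner))     # □γ(α'(t, q), q)
    return subst(alpha_prime, "x", boxed)

columns = {
    "a": (("not", X), P),          # α = ¬□p
    "b": (X, P),                   # α = □p
    "c": (("imp", X, Q), P),       # α = □p → q
    "d": (X, ("imp", P, Q)),       # α = □(p → q)
}
for label, (alpha_prime, gamma) in columns.items():
    print(label, show(beta(alpha_prime, gamma)))
```

For the four columns the sketch prints ¬□¬t, □t, □(t → q) → q and □(t → q), in that order, which is row (v).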


5.2. Conservation results 


In this section, we present some conservation results of Kreisel. 

Recall that Hilbert’s Conservation Program called for a proof that the 
use of ideal statements and abstract reasoning led to no new real theorems. 
While the incompleteness theorems showed that this is in general impossi- 
ble, they do not rule out the possibility of success in special cases. In fact, 
we will even use the Second Incompleteness Theorem in establishing one 
conservation result. 

First, we present the main result: 


5.2.1. CONSERVATION THEOREM. Let φ ∈ Π₁. Then T ⊢ φ ⇒ S + Con_T ⊢ φ.


PROOF. Let φ ∈ Π₁ and suppose T ⊢ φ. D1 yields S ⊢ Pr_T(⌜φ⌝). But Con_T is
equivalent to RFN_{Π₁}(T) over S by Theorem 4.1.4, whence S + Con_T ⊢ φ. □
The next two results are corollaries:

5.2.2. THEOREM. Let φ ∈ Π₁. Then T + ¬Con_T ⊢ φ ⇒ T ⊢ φ.

PROOF.

T + ¬Con_T ⊢ φ ⇒ T + Con_{T+¬Con_T} ⊢ φ, by Theorem 5.2.1
               ⇒ T + Con_T ⊢ φ,
by the Formalized Second Incompleteness Theorem. But we have
T + Con_T ⊢ φ, T + ¬Con_T ⊢ φ,

whence T ⊢ φ. □


5.2.3. THEOREM. Let T, T' contain S and let
(*) T' ⊢ ∀x [Prov_T(x, ⌜φ⌝) → Prov_{T'}(tx, ⌜ψ⌝)]


for some primitive recursive term t. Then S ⊢ Pr_T(⌜φ⌝) → Pr_{T'}(⌜ψ⌝).


PROOF. By the Conservation Theorem,

S + Con_{T'} ⊢ (*) ⊢ ¬Pr_{T'}(⌜ψ⌝) → ¬Pr_T(⌜φ⌝),
by contraposition. Now absorb Con_{T'} into ¬Pr_{T'}(⌜ψ⌝) and contrapose
again. □

5.2.4. COROLLARY (Relative Consistency Theorem). Let T, T' contain S
and let


T' ⊢ ∀x [Prov_T(x, ⌜Λ⌝) → Prov_{T'}(tx, ⌜Λ⌝)]
for some primitive recursive term t. Then S ⊢ Con_{T'} → Con_T.


The corollary is worth commenting on. By the Second Incompleteness 
Theorem, one cannot prove consistency of strong theories within weak 
theories. Sometimes, the consistency of a strong theory is genuinely in 
doubt and one can give a relative consistency result, e.g. 


(#*) Congr — Congrseacn. 


A throwback to Hilbert's Consistency Program is the demand that the
proof of (**) be carried out within as weak a system as possible.
Epistemologically, there is no need to give a proof of (**) in, say, PRA: For, the
value of (**) depends entirely on the acceptance of ZF and one might as
well prove (**) in the latter theory, a technically easier undertaking.
However, we can avoid philosophical bickering; for, by Corollary 5.2.4, if
this is done at all nicely, it automatically follows that (**) can be proven in
the weaker theory, namely PRA. Thus, there is nothing to argue about
here.

In the last paragraph, we pointed out how one conservation result caused 
a potential philosophical problem to vanish. Usually, the value of conserva- 
tion results is that they allow one to use stronger techniques to shorten 
proofs and conserve one’s energy. We refer the reader to Chapter D.3 for 
quantitative information. 


*6. The formalized completeness theorem 


There are several possible advanced topics that one could discuss. The 
close relation between induction principles and reflection principles (often
bearing the misnomer, "consistency proofs") is discussed in Chapter D.2.
One could discuss efforts to complete a formally incomplete theory by the 
iterated addition of reflection principles. Another topic concerns proof- 
theoretic applications of the reflection principles. 

We shall discuss the formalized completeness theorem and use it to give 
model-theoretic proofs of the incompleteness theorems. 

In this section, we set S= PA. 


*6.1. The Hilbert-Bernays Completeness Theorem 


Formalizing the Henkin completeness proof within PA yields: 


6.1.1. HILBERT-BERNAYS COMPLETENESS THEOREM. Let U have a primitive
recursive set of axioms. There is a Δ₂ set of formulae, Tr_𝔐, such that in
PA + Con_U one can prove that this set defines a model of U:


PA + Con_U ⊢ ∀x (Pr_U(x) → Tr_𝔐(x)).


Let us explain this: A formula φ is said to be Δ₂ if it can be written both
as a Σ₂ and a Π₂ formula. Theorem 6.1.1 asserts that, modulo Con_U, one
can prove in PA the existence of a model of U whose truth definition is Δ₂.

The meaning of this is best understood by a description of the proof,
which is just an arithmetization of the set-theoretic one: One adds to the
language of U an infinite primitive recursive set of new constants c₀, c₁, ...,
and adds the axiom

(*) ∃x φx → φ(c_⌜φ⌝)


for each formula φ. One then enumerates all sentences φ₀, φ₁, ... of this
augmented language and defines a complete theory by starting with U and
adding, at step n, φ_n or ¬φ_n according to whether φ_n is consistent with
what has been chosen before or not. The construction is readily described
within PA. Assuming Con_U, one can also prove that the construction never
terminates. The resulting set of sentences forms a complete theory which,
by virtue of the axioms (*), forms a model of U. Inspection shows that the
truth definition of the model is Δ₂.
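
The step-by-step choice between φ_n and ¬φ_n can be illustrated in a toy setting. The sketch below is only a propositional analogue, not the arithmetized first-order construction: consistency is decided by brute force over truth assignments (which is possible only because the toy language is propositional), sentences are represented as Python functions from assignments to truth values, and the names consistent and complete as well as the sample theory U = {p → q} are assumptions made for this illustration.

```python
from itertools import product

def consistent(sentences, atoms):
    """Brute-force check: does some truth assignment satisfy every sentence?"""
    for values in product([False, True], repeat=len(atoms)):
        env = dict(zip(atoms, values))
        if all(sentence(env) for sentence in sentences):
            return True
    return False

def complete(axioms, enumeration, atoms):
    """At step n add phi_n if it is consistent with the choices so far, else its negation."""
    chosen = list(axioms)
    path = []
    for name, phi in enumeration:
        if consistent(chosen + [phi], atoms):
            chosen.append(phi)
            path.append(name)
        else:
            # phi=phi freezes the current sentence inside the negation
            chosen.append(lambda env, phi=phi: not phi(env))
            path.append("¬" + name)
    return path

atoms = ["p", "q"]
axioms = [lambda env: (not env["p"]) or env["q"]]      # toy theory U = {p → q}
enumeration = [
    ("φ0", lambda env: not env["q"]),                  # φ0 is ¬q
    ("φ1", lambda env: env["p"]),                      # φ1 is p
    ("φ2", lambda env: env["q"]),                      # φ2 is q
]
path = complete(axioms, enumeration, atoms)
print(path)
```

Here φ0 is accepted, after which φ1 and φ2 are each inconsistent with the earlier choices and are replaced by their negations, so the computed path is φ0, ¬φ1, ¬φ2.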


*6.2. The incompleteness theorems 


Scott was the first to observe that one can give a model-theoretic proof of 
the First Incompleteness Theorem: 


6.2.1. FIRST INCOMPLETENESS THEOREM. There is a sentence φ such that
(i) PA ⊬ φ and (ii) PA ⊬ ¬φ.


PROOF. Assume PA is complete. Then, since PA is true, PA ⊢ Con_PA and we
can apply Theorem 6.1.1 to obtain a formula Tr_𝔐 which gives a truth
definition for a model of PA. Choose φ by


PA ⊢ φ ↔ ¬Tr_𝔐(⌜φ⌝).


We claim PA ⊬ φ, PA ⊬ ¬φ. For if PA ⊢ φ, then PA ⊢ Tr_𝔐(⌜φ⌝) so PA ⊢ ¬φ.
Similarly, PA ⊢ ¬φ implies PA ⊢ φ. □


We shall discuss this proof a little later. First, we wish to prove the 
Second Incompleteness Theorem. For this, we need some notation: 


6.2.2. DEFINITION. Let 𝔐, 𝔑 be models of PA. If 𝔐 is definable in 𝔑 (even
in the weak sense that the atomic relations of 𝔐 are 𝔑-definable), we write
𝔐 <_d 𝔑.

The Hilbert-Bernays Completeness Theorem yields immediately the
fact: If 𝔑 ⊨ PA + Con_PA, then there is an 𝔐 such that 𝔐 <_d 𝔑.

The usefulness of this notion is given by the following. 


6.2.3. LEMMA. Let 𝔐, 𝔑 be models of PA, 𝔐 <_d 𝔑. Then 𝔐 is definably an
end-extension of 𝔑, i.e. there is a unique 𝔑-definable isomorphic
embedding of 𝔑 into 𝔐 as an initial segment of 𝔐.

PROOF. The proof is straightforward: 0_𝔑 obviously maps onto 0_𝔐. Extend
the map, say F, by


F(S_𝔑 x) = S_𝔐(Fx),


where S denotes the successor functions of the models. The recursion
equations in 𝔐 and induction in 𝔑 verify that F is an isomorphism of 𝔑
onto an initial segment of 𝔐. □


Using this lemma, we may present Kreisel’s proof of the following. 
6.2.4. SECOND INCOMPLETENESS THEOREM. PA ⊬ Con_PA.


PROOF. Let φ₀, φ₁, ... be an enumeration of sentences of the language
described in the proof of 6.1.1. That proof can be viewed as an attempt to
choose an infinite consistent path through the following tree:


[Tree diagram: a full binary tree in which each node at level n has two
successors, φ_n on the left and ¬φ_n on the right.]


We may assume for definiteness that the construction proceeds by taking
the leftmost consistent path. Choose φ such that PA ⊢ φ ↔ ¬Tr_𝔐(⌜φ⌝), for
the truth definition, Tr_𝔐, of the model constructed. Let φ = φ_{n₀} in the
enumeration. The tree, as defined to the n₀-th level, is absolute, i.e. it is the
same in every model. (Note: This is not true for the infinite tree simply
because any non-standard model 𝔑 will encode a level for φ_N for
non-standard integers N. But the finite trees are fixed.)

Assume PA ⊢ Con_PA. Let 𝔑₀ ⊨ PA. Then there is a model, 𝔑₁, definable in
𝔑₀: 𝔑₁ <_d 𝔑₀. But 𝔑₁ is also a model of Con_PA and there is an 𝔑₂ <_d 𝔑₁.
Repeating, we get an infinite sequence,
𝔑₀ >_d 𝔑₁ >_d 𝔑₂ >_d ⋯,
such that (say)
𝔑₀ ⊨ φ_{n₀}, 𝔑₁ ⊨ ¬φ_{n₀}, 𝔑₂ ⊨ φ_{n₀}, ....


We now use the construction to derive a contradiction. Given 𝔑_i, let
φ^i = (φ₀^{e₀}, φ₁^{e₁}, ..., φ_{n₀}^{e_{n₀}}) denote the portion of the path used in
constructing 𝔑_{i+1}, where φ_j^0 = φ_j, φ_j^1 = ¬φ_j, and e_j ∈ {0, 1}. Recall that
φ^i is the leftmost consistent path (as viewed) in 𝔑_i.

Either using the facts that Pr_PA is Σ₁ and that Σ₁ sentences are preserved
under end-extensions, or using D2 and the fact that


PA + Con_PA ⊢ ∀ψ (Pr_PA(⌜ψ⌝) → Tr_𝔐(⌜ψ⌝)),


one sees that φ^{i+1} can never lie to the left of φ^i. For, once 𝔑_i says that a
sequence is inconsistent, every resulting 𝔑_j will also assert its
inconsistency. Put differently, larger models can allow new proofs (even of
inconsistencies), e.g. by means of non-standard axioms encoded by infinite
integers; but they cannot erase old proofs.

Thus φ^{i+1} cannot lie to the left of φ^i. Furthermore, φ^{i+1} ≠ φ^i since


(a) angie = @ Nema! <> 19 mot, 


Thus the path φ^{i+1} lies properly to the right of φ^i.

But the tree determined by φ₀, ..., φ_{n₀} is finite and there are only 2^{n₀+1}
different paths through this tree. This contradicts the assumption
PA ⊢ Con_PA by which we obtained an infinite sequence of paths:


φ⁰, φ¹, .... □


*6.3. Comments 


With Theorems 6.2.1 and 6.2.4, we have gone full circle: We have gotten 
back to the results with which we began this chapter. The present proofs 
differ somewhat from the originals and it is worth making a few compari- 
sons. 

Let us first comment on the forms of the independent sentences given by
the two proofs of the First Incompleteness Theorem. "The" sentence
which asserts its own unprovability is

(i) unique up to provable equivalence;

(ii) Π₁ and hence true.

"The" sentence asserting its falsity in the model constructed is
(i') not unique: for, if φ ↔ ¬Tr_𝔐(⌜φ⌝), then


¬φ ↔ ¬Tr_𝔐(⌜¬φ⌝);
(ii') Δ₂ and, by (i'), there is no obvious way of deciding its truth or
falsity.

(ii') can be gotten around as follows: One of φ, ¬φ is true. Let
∃x ∀y χxy be the Σ₂ form of the true statement. Then, for some n,
ψ = ∀y χn̄y is true. But PA ⊬ ψ as one would then have PA ⊢ ∃x ∀y χxy.
But PA ⊬ ¬ψ since ψ is true.

The model-theoretic proof of the First Incompleteness Theorem given
here is similar to the classical one: once one assumes completeness,
Tr_𝔐 is the same as Pr_PA. The model-theoretic proof of the Second
Incompleteness Theorem differs radically from the classical one and we
note some differences in the sort of information they yield:

(i) The classical proof readily yields the formalized version,
PA ⊢ Con_PA → Con_{PA+¬Con_PA}. Further, it applies directly to weaker theories
like PRA.

(ii) While the classical proof yields the existence of some model in which
Con_PA fails, the model-theoretic one shows that, for any presentation of the
Henkin construction (as given by the encoding, the enumeration
φ₀, φ₁, ..., etc.), there is a number m such that, for any model 𝔑 of PA, the
sequence


(*) 𝔑 >_d 𝔑₁ >_d ⋯,


determined by the given presentation, must stop after fewer than m steps
with a model in which Con_PA is false. (Of course, by the classical proof,
there is a presentation of the Henkin construction with a very short
sequence (*): simply let φ₀ = ¬Con_PA. The present proof works for all
enumerations φ₀, ....)


References 


The following is a very biased selection of the many papers on the topic 
of this chapter. It includes some papers whose contents were not discussed. 


FEFERMAN, S. 
[1962] Transfinite recursive progressions of axiomatic theories, J. Symbolic Logic, 27, 
259-316. 
GÖDEL, K.
[1931] Über formal unentscheidbare Sätze der Principia Mathematica und verwandter
Systeme, I, Monatsh. Math. Phys., 38, 173-198.
HASENJAEGER, G.
[1953] Eine Bemerkung zu Henkin's Beweis für die Vollständigkeit des Prädikatenkalküls
der ersten Stufe, J. Symbolic Logic, 18, 42-48.




HILBERT, D. and P. BERNAYS 
[1970] Grundlagen der Mathematik, | (Springer, Berlin, 2nd ed.). 
JEROSLOW, R.G.
[1973] Redundancies in the Hilbert-Bernays derivability conditions for Gödel's second
incompleteness theorem, J. Symbolic Logic, 38, 359-367.
KENT, C.F.
[1973] The relation of A to Prov⌜A⌝ in the Lindenbaum sentence algebra, J. Symbolic
Logic, 38, 295-298.
KREISEL, G. and A. LEVY
[1968] Reflection principles and their use for establishing the complexity of axiomatic
systems, Z. Math. Logik Grundlagen Math., 14, 97-142.
KREISEL, G. and G. TAKEUTI
[1974] Formally self-referential propositions in cut-free classical analysis and related
systems, Dissertationes Math., 118, 1-50.
LÖB, M.H.
[1955] Solution of a problem of Leon Henkin, J. Symbolic Logic, 20, 115-118.
MESCHKOWSKI, H.
[1973] Hundert Jahre Mengenlehre (Deutscher Taschenbuch Verlag, München).
REID, C.
[1970] Hilbert (Springer, Berlin).
ROSSER, J.B.
[1936] Extensions of some theorems of Gödel and Church, J. Symbolic Logic, 1, 87-91.
SAMBIN, G. 
[0000] An effective fixed-point theorem in intuitionistic diagonalizable algebras, Studia 
Logica, to appear. 
SMORYNSKI, C. 
[0000] w-consistency and reflection, in: Proceedings of the 1975 Logic Colloquium at 
Clermont-Ferrand, to appear. 
SOLOVAY, R.
[1976] Provability interpretations of modal logic, Isr. J. Math., 25, 287-304. 




D.2 


Proof Theory: 
Some Applications of 
Cut-Elimination 


HELMUT SCHWICHTENBERG 


Contents


1. Introduction
2. Cut-elimination for first-order logic
3. Transfinite induction
4. Bounds from proofs of existential theorems
5. Transfinite induction and the reflection principle
References


HANDBOOK OF MATHEMATICAL LOGIC
Edited by J. Barwise
© North-Holland Publishing Company, 1977


1. Introduction 


1.1. Proof theory began with Hilbert's Program, which called for elementary
consistency proofs for formalized mathematical theories S. Equivalently
(under quite general conditions discussed in Chapter D.1) this
program can be formulated as follows. Given a formalization in S of an
abstract proof of an elementary assertion φ (example: proof of n + m =
m + n, n, m variables for natural numbers, in an axiomatic set theory), can
one always conclude from this by elementary means that φ is true? Or
more precisely, can one give an elementary proof of the schema


(*) ∃x Der_S(x, ⌜φ⌝) → φ,


where Der_S(·,·) is a canonical representation of the derivation predicate
for S and φ ranges over formulas corresponding to elementary assertions?
By the well-known second incompleteness theorem of Gödel, discussed in
Chapter D.1, (*) is underivable in S, provided S is sufficiently strong. Now
since one would expect that a strong theory S contains at least formalizations
of all "elementary" proofs, one may fairly say that this refutes
Hilbert's Program in its original form. However, one can also try to extend
the (originally quite vague) conception of an elementary proof and then
look for such a proof of (*) not formalizable in S; in fact, this was Hilbert's
reaction to Gödel's result (cf. the introduction to HILBERT and BERNAYS
[1934]). We shall not deal here with contributions to Hilbert's Program
along these lines (for this, cf. e.g. SCHÜTTE [1960]), but rather concentrate
on some less delicate questions which are derived from and closely related
to Hilbert's Program.


1.1.1. A theory S is called conservative over a theory T if any formula of 
L(T) (the language of T) derivable in S is already derivable in T. Note that 
this would be a corollary of the derivability of (*) in T (under quite general
conditions). There are numerous important and nontrivial examples of 
theories S conservative over a subtheory T. Some of these are discussed in 
Chapters D.4 and D.5. We shall give here a very simple example and show 
that first-order logic is conservative over its part which uses formulas of a 
restricted complexity only (cf. Section 2.8). 


1.1.2. The schema (*) (now taken with arbitrary φ) provides, generally, a
proper extension of S. However, (*) has a metamathematical character and
its mathematical strength is difficult to judge. So one might ask for an
equivalent formulation of (*) having a clear mathematical meaning. This
question has been answered for a wide variety of theories S. We shall
confine ourselves here to a basic example, namely (classical) arithmetic Z,
and prove that in this case (a version of) (*) is equivalent to the schema of
transfinite induction up to ε₀.


1.2. Our second starting point is a question which only more recently came
to the attention of proof theorists (cf. KREISEL [1958]): "What more do we
know if we have proved a theorem by restricted means than if we merely
know that it is true?" Again we shall confine ourselves to the discussion of a
basic example, where the "restricted means" are those formalized in
arithmetic Z. We shall obtain a complete answer to the above question, due
to KREISEL [1952]. For some subsystems of analysis one can also get
satisfactory answers to questions of the type above; for this we refer the
reader to Chapter D.4.


1.3. From a more technical point of view, we survey some elementary
applications of a basic technique in proof theory: the method of
cut-elimination. This method is due to Gentzen and was later developed
particularly by Schütte and Tait (cf. SCHÜTTE [1960] and TAIT [1968]). Other
techniques frequently used in proof theory are adequately covered in other
chapters in this volume. Especially important is the method of functional
interpretation due to GÖDEL [1958], which is treated in Chapter D.5.


1.4. We now give a more detailed account of the content of the present 
chapter. 

In Section 2 we prove the Cut-Elimination Theorem for first-order logic; 
as a corollary we obtain the conservative extension result mentioned 
above. The proof of this basic Cut-Elimination Theorem is set up in such a 
way that it can be easily generalized to many other cases where a 
cut-elimination argument is applied, in particular to those treated here. 

In Section 3 we discuss for arithmetic Z the provability and unprovability
of initial cases of transfinite induction. The result (due to GENTZEN [1943])
is well known: Given a natural well-ordering < of order type ε₀, then with
respect to < transfinite induction is provable up to any ordinal < ε₀, but
not up to ε₀ itself.

The underivability in Z of transfinite induction up to ε₀ will also follow
from Gödel's second incompleteness theorem together with the fact that
transfinite induction up to ε₀ suffices to prove the reflection principle for Z
and hence the consistency of Z (cf. Section 5). Here we give a direct proof
of this underivability result, using a cut-elimination argument. Technically,
this provides an easy and convincing example of the usefulness of infinite
derivations and the strength of the cut-elimination method when applied to
infinite derivations.

In Section 4 we take up the question of Section 1.2. We first consider the
special case of ∀∃-formulas. Suppose ∀n ∃m φ(n, m) with φ(n, m)
quantifier-free is derivable in Z. We shall show that then we can find a
function F satisfying ∀n φ(n, F(n)) which has a somewhat limited rate of
growth: F can be defined by primitive recursive operations and
α-recursions for α < ε₀.

We then turn to the general case of arbitrary Z-formulas. At first sight a
generalization of the result for ∀∃-formulas seems to be impossible, since
∀n ∃m ∀k (T(n, n, k) → T(n, n, m)) is derivable (in classical logic and
hence) in Z, but there is no recursive function F satisfying
∀n ∀k (T(n, n, k) → T(n, n, F(n))) (this would contradict the recursive
undecidability of ∃k T(n, n, k); T is Kleene's T-predicate). However,
there is such a generalization, the so-called
No-Counterexample-Interpretation due to KREISEL [1952]. To explain it let us first consider a
formula of the above form, i.e. ψ :≡ ∀n ∃m ∀k φ(n, m, k) with φ(n, m, k)
quantifier-free. Its negation is equivalent to ∃n ∀m ∃k ¬φ(n, m, k) and
hence (using the axiom of choice) also to ∃n, f ∀m ¬φ(n, m, f(m)); such
n, f can be considered as providing a counterexample to the given formula
ψ. So a way to express the content of ψ is to say that there is no such
counterexample, i.e. that for any n, f we have ∃m φ(n, m, f(m)) (this
formula is the Herbrand normal form of ψ), i.e. that there is a functional F
such that ∀n, f φ(n, F(n, f), f(F(n, f))) holds. Now the additional information
we obtain from the fact that ψ is derivable in Z is that such a functional
F can be found which again has a somewhat limited complexity: F can be
defined by primitive recursive operations (in the sense of KLEENE [1959])
and α-recursions for some α < ε₀, or (as we shall say) F is <ε₀-recursive.

Generally, let ψ be an arbitrary Z-formula and let ψ_H ≡ ∃m ψ^H(n, m, f) be
its Herbrand normal form which is derivable in Z iff ψ is. (We use f for
finite sequences of function variables and n, m for finite sequences of
number variables.) The result then is that from the derivability of ψ in Z we
can conclude that there are <ε₀-recursive functionals F satisfying
∀n, f ψ^H(n, F(n, f), f). We also prove that this result is the best possible in
the sense that no smaller class of functionals suffices.

The proof involves a new point: it makes use of the fact that the
cut-elimination procedure for infinite derivations is an effective operation.
More precisely, we show that for a natural coding of infinite derivations the
cut-elimination procedure is given by a primitive recursive function.

In Section 5 we come back to the question asked in Section 1.1.2 and
prove the result stated there (which is due to KREISEL and LEVY [1968]).
The proof is a formalization of the argument in Section 4, i.e.
cut-elimination for codes of infinite derivations.


Acknowledgements: Parts of the present chapter are based on other
sources, in particular TAIT [1968] (for the proof of the Cut-Elimination
Theorem in Section 2) and SCHÜTTE [1960] (for the proof in Section 3 of the
underivability of transfinite induction up to ε₀ in Z). Also I want to thank S.
Feferman, G. Kreisel, R. Statman and A.S. Troelstra for many helpful
comments and suggestions; in particular, the idea to prove the
No-Counterexample-Interpretation by means of a cut-elimination argument is
due to Kreisel.


2. Cut-elimination for first-order logic 


We prove this basic Cut-Elimination Theorem by a method due to 
Gentzen which is central for our later work: nearly all the results 
mentioned in the introduction will be obtained by generalizations of this 
method. Technically we shall follow TAIT [1968] quite closely, but with one
exception: we shall avoid infinite formulas throughout (and later use infinite
derivations only where they seem to be essential).


2.1. We use the ordinary language of first-order logic, for simplicity in the
following version: formulas are built up from atomic and negated atomic
formulas by means of ∧, ∨, ∀x, ∃x. The negation ¬φ of a formula φ is
defined to be the formula obtained from φ by
(i) putting a ¬ in front of any atomic formula,
(ii) replacing ∧, ∨, ∀x, ∃x by ∨, ∧, ∃x, ∀x, respectively, and
(iii) dropping double negations.
This treatment of negation is possible since we assume classical logic
throughout. Note that ¬¬φ is identical with φ, ¬¬φ ≡ φ. As usual, we
define φ → ψ to be ¬φ ∨ ψ and φ ↔ ψ to be (φ → ψ) ∧ (ψ → φ). Let |φ|
(the length of φ) be defined as follows.
(i) |φ| = |¬φ| = 0, for φ atomic.
(ii) |φ ∧ ψ| = |φ ∨ ψ| = sup(|φ|, |ψ|) + 1.
(iii) |∀x φ(x)| = |∃x φ(x)| = |φ(x)| + 1.
Note that |¬φ| = |φ|.
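
The defined negation and the length function are both simple structural recursions, which the following minimal Python sketch spells out. The class names Lit, Bin and Quant, the string tags for connectives, and the sample formula are assumptions made for this illustration; the sketch only mirrors clauses (i)-(iii) above and is not taken from the text.

```python
from dataclasses import dataclass

# Formulas built from literals by ∧, ∨, ∀x, ∃x (encoding assumed for this sketch).
@dataclass(frozen=True)
class Lit:              # an atomic formula, or a negated atomic formula
    name: str
    positive: bool = True

@dataclass(frozen=True)
class Bin:              # op is "and" or "or"
    op: str
    left: object
    right: object

@dataclass(frozen=True)
class Quant:            # q is "forall" or "exists"
    q: str
    var: str
    body: object

DUAL = {"and": "or", "or": "and", "forall": "exists", "exists": "forall"}

def neg(f):
    """Defined negation: flip literals, dualize connectives and quantifiers."""
    if isinstance(f, Lit):
        return Lit(f.name, not f.positive)
    if isinstance(f, Bin):
        return Bin(DUAL[f.op], neg(f.left), neg(f.right))
    return Quant(DUAL[f.q], f.var, neg(f.body))

def length(f):
    """|φ|: 0 on literals, plus 1 for each connective or quantifier on the longest branch."""
    if isinstance(f, Lit):
        return 0
    if isinstance(f, Bin):
        return max(length(f.left), length(f.right)) + 1
    return length(f.body) + 1

def implies(a, b):
    """φ → ψ is defined as ¬φ ∨ ψ."""
    return Bin("or", neg(a), b)

# Sample formula: ∀x (P(x) ∧ ∃y ¬Q(x,y))
example = Quant("forall", "x",
                Bin("and", Lit("P(x)"), Quant("exists", "y", Lit("Q(x,y)", False))))
```

On the sample formula, neg(neg(example)) is identical to example and both example and neg(example) have length 3, in line with the two notes in the text.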
2.2. Logical rules


We shall derive finite sets of formulas, denoted by Γ, Δ, Λ, Γ(a), .... The
intended meaning of Γ is the disjunction of all formulas in Γ. We use the
notation


Γ, φ for Γ ∪ {φ},
Γ, Δ for Γ ∪ Δ.

(i) Normal rules:


A        Γ, φ, ¬φ        if φ is atomic.


         Γ, φ     Γ, ψ
∧        --------------
           Γ, φ ∧ ψ


          Γ, φ                     Γ, ψ
∨₀      ----------       ∨₁      ----------
        Γ, φ ∨ ψ                 Γ, φ ∨ ψ


          Γ, φ(x)        if x is not free in Γ
∀       ------------
        Γ, ∀x φ(x)       (x is called eigenvariable of ∀).


          Γ, φ(s)
∃       ------------
        Γ, ∃x φ(x)

(ii) Cut-rule:

         Γ, φ     Γ, ¬φ
Cut      ---------------
                Γ


The principal formulas (p.f.) in A are φ and ¬φ. In ∧, ∨_i, ∀ and ∃ the p.f.
is φ ∧ ψ, φ ∨ ψ, ∀x φ(x) and ∃x φ(x), respectively. Cut has no p.f. The
minor formula (m.f.) in the premiss Γ, φ of ∧ is φ, and in the premiss Γ, ψ
of ∧ it is ψ. In the premiss of ∨₀, ∨₁, ∀ and ∃ the m.f. is φ, ψ, φ(x) and φ(s),
respectively. The m.f. in the premiss Γ, φ of Cut is φ, and in the premiss Γ,
¬φ of Cut it is ¬φ. So any inference has the form


         Γ, φ_i   for all i < k
(*)      -----------------------
                 Γ, Δ


(0 ≤ k ≤ 2), where Δ consists of the p.f. and φ_i is the m.f. in the i-th
premiss. The formulas in Γ are called side formulas (s.f.) of (*).
Derivations are built up in tree form, as usual. More precisely, they are
defined by the following induction. Consider an inference (*) as above and
assume that derivations d_i of its premisses Γ, φ_i are given. Then d =
((d_i)_{i<k}, (φ_i)_{i<k}, Δ, Γ) is a derivation of the conclusion Γ, Δ of (*). The
inference considered is called the last inference of d. The d_i are called direct
subderivations of d. We write d ⊢ Γ if d is a derivation of Γ, and ⊢ Γ if there
is a derivation of Γ.

The length |d| of a derivation d is inductively defined to be
sup_{i<k}(|d_i| + 1) if the d_i, i < k, are the direct subderivations of d. Hence
|d| = 0 if d has no direct subderivations. The cut-rank ρ(d) of a derivation
d is also defined by induction: Let d_i, i < k, be the direct subderivations of
d. If the last inference of d is a cut with m.f. φ and ¬φ let
ρ(d) := sup(|φ| + 1, sup_{i<k} ρ(d_i)). Otherwise, let ρ(d) := sup_{i<k} ρ(d_i). Note
that ρ(d) = 0 iff d is cut-free.
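
The two inductive definitions can be written as short recursions over derivation trees. The sketch below is a simplified illustration: the Derivation class, its field names, and the convention of storing only the length |φ| of a cut formula (rather than the formula and the sets Γ, Δ themselves) are assumptions of this sketch, and no check is made that the tree is built by correct rule applications.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Derivation:
    rule: str                                   # e.g. "A", "and", "or0", "forall", "exists", "cut"
    subs: Tuple["Derivation", ...] = ()         # direct subderivations d_i, i < k
    cut_formula_length: Optional[int] = None    # |φ| of the cut formula, used only when rule == "cut"

def length(d):
    """|d| = sup over i < k of (|d_i| + 1); the empty sup is 0."""
    return max((length(s) + 1 for s in d.subs), default=0)

def cut_rank(d):
    """ρ(d): sup of the ranks of the subderivations, and |φ| + 1 if d ends in a cut on φ."""
    rank = max((cut_rank(s) for s in d.subs), default=0)
    if d.rule == "cut":
        rank = max(rank, d.cut_formula_length + 1)
    return rank

axiom = Derivation("A")
conj = Derivation("and", (axiom, axiom))
cut = Derivation("cut", (conj, Derivation("or0", (axiom,))), cut_formula_length=1)
```

In this example the axiom has length 0, conj has length 1 and cut-rank 0 (it is cut-free), and the derivation ending in a cut on a formula of length 1 has length 2 and cut-rank 2.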

It is convenient to use the notions of free and bound occurrences of
variables in derivations. A free occurrence of a variable x inside an
occurrence of a formula in a derivation d is called bound in d if "below"
that occurrence x is used as an eigenvariable of an inference ∀; otherwise
this occurrence of x is called free in d. We use the notation d, d(x), ... for
derivations where it is understood that there may be other free variables
different from those actually shown.


2.3. Let d, Γ be obtained from a derivation d by adding Γ to the side
formulas of all inferences in d. It is trivial to see that d, Γ is again a
derivation provided no variable free in Γ is bound in d. The latter
condition can always be assumed to hold if we identify derivations which
differ only by a change of bound variables. Hence we have:


2.3.1. WEAKENING LEMMA. If d ⊢ Δ, then d, Γ ⊢ Γ, Δ with |d, Γ| = |d| and
ρ(d, Γ) = ρ(d).


2.4. Let d(s) denote the result of substituting s for all free occurrences of x
in d(x) (note that some changes of bound variables in d(x) may be
necessary). Then we obviously have


2.4.1. SUBSTITUTION LEMMA. If d(x) ⊢ Γ(x), then d(s) ⊢ Γ(s) with |d(s)| =
|d(x)| and ρ(d(s)) = ρ(d(x)).


2.5. INVERSION LEMMA. (i) If d ⊢ Γ, φ₀ ∧ φ₁, then we can find d_i ⊢ Γ, φ_i
(i = 0, 1) with |d_i| ≤ |d| and ρ(d_i) ≤ ρ(d).

(ii) If d ⊢ Γ, ∀x ψ(x), then we can find d₀ ⊢ Γ, ψ(x) with |d₀| ≤ |d| and
ρ(d₀) ≤ ρ(d).


PROOF. The proofs of (i) and (ii) are almost identical, both by induction on
|d|. We restrict ourselves to (ii). Let φ be ∀x ψ(x). We can assume φ ∉ Γ,
for otherwise the result follows by weakening, taking d, ψ(x).

Case 1: φ is not a p.f. in the last inference of d. Then this inference has
the form


Λ, φ, χ_j   for all j < k
-------------------------
       Λ, φ, Δ

with m.f. χ_j, p.f. Δ and s.f. Λ, φ, and Γ = Λ, Δ. By the induction hypothesis
⊢ Λ, ψ(x), χ_j for all j < k, with length < |d| and cut-rank ≤ ρ(d). The
result follows by the inference


Λ, ψ(x), χ_j   for all j < k
----------------------------
       Λ, ψ(x), Δ


Case 2: φ is a p.f. in the last inference of d. We can assume that φ is a
s.f. in the last inference of d, replacing d by d, φ if necessary. So that
inference is of the form


Γ, φ, ψ(x)
----------
   Γ, φ


with m.f. ψ(x), p.f. φ and s.f. Γ, φ. By the inductive hypothesis ⊢ Γ, ψ(x),
with length < |d| and cut-rank ≤ ρ(d). This completes the proof. □


2.6. REDUCTION LEMMA. Let d_0 ⊢ Γ, φ and d_1 ⊢ Δ, ¬φ, both with cut-rank
ρ(d_i) ≤ |φ|. Then we can find d ⊢ Γ, Δ with |d| ≤ |d_0| + |d_1| and
ρ(d) ≤ |φ|.


Of course we could derive Γ, Δ by an application of the cut-rule, but the
resulting derivation would then have cut-rank |φ| + 1.


PROOF. The proof is by induction on |d_0| + |d_1|. Since |φ| = |¬φ| and
¬¬φ = φ, the lemma is symmetric with respect to the two given
derivations.

Case 1: Either φ is not a p.f. in the last inference of d_0, or else ¬φ is
not a p.f. in the last inference of d_1. By symmetry we can assume the
former. Then the last inference of d_0 is of the form


    Λ, φ, Θ_i   for all i < k
    -------------------------
    Λ, φ, Θ

with m.f. Θ_i, p.f. Θ and s.f. Λ, φ, and Γ = Λ, Θ. By the induction hypothesis
⊢ Λ, Δ, Θ_i for all i < k with length < |d_0| + |d_1| and cut-rank ≤ |φ|. The
result then follows by the inference

    Λ, Δ, Θ_i   for all i < k
    -------------------------
    Λ, Δ, Θ


Case 2: φ is a p.f. in the last inference of d_0, and ¬φ is a p.f. in the last
inference of d_1.

Case 2.1: φ or ¬φ is atomic. Then the last (and only) inferences of d_0
and d_1 are instances of the rule A and hence Γ, Δ is also an instance of the
rule A.

Case 2.2: φ or ¬φ is a disjunction φ_0 ∨ φ_1. By symmetry we can
assume the former, so ¬φ is ¬φ_0 ∧ ¬φ_1. We can assume that φ is a s.f. of
the last inference of d_0, replacing d_0 by d_0, φ if necessary. So that inference
is of the form

    Γ, φ, φ_i
    ---------
    Γ, φ

By the induction hypothesis ⊢ Γ, Δ, φ_i with length < |d_0| + |d_1| and
cut-rank ≤ |φ|. By the Inversion Lemma ⊢ Δ, ¬φ_i with length ≤ |d_1| <
|d_0| + |d_1| and cut-rank ≤ |φ|. The result follows by an application of the
cut rule.

Case 2.3: φ or ¬φ is of the form ∃x ψ(x). Again we can assume the
former (so ¬φ is ∀x ¬ψ(x)), and also that φ is a s.f. of the last inference
of d_0. So that inference is of the form

    Γ, φ, ψ(s)
    ----------
    Γ, φ

By the induction hypothesis ⊢ Γ, Δ, ψ(s) with length < |d_0| + |d_1| and
cut-rank ≤ |φ|. By the Inversion Lemma ⊢ Δ, ¬ψ(x) with length ≤ |d_1|
and cut-rank ≤ |φ|. By the Substitution Lemma ⊢ Δ, ¬ψ(s) also with
length ≤ |d_1| and cut-rank ≤ |φ|. The result follows by an application of
the cut rule. □


2.7. CUT-ELIMINATION THEOREM. If d ⊢ Γ and ρ(d) > 0, then we can find
d' ⊢ Γ with ρ(d') < ρ(d) and |d'| ≤ 2^{|d|}.


PROOF. The proof is by induction on |d|. We may assume that the last
inference of d is a cut


    Γ, φ    Γ, ¬φ
    -------------
    Γ


with |φ| + 1 = ρ(d), for otherwise the result follows by the induction
hypothesis (making use of the fact that our rules all have finitely many
premisses). So assume this. Let d_0 ⊢ Γ, φ and d_1 ⊢ Γ, ¬φ be the direct
subderivations of d. By the induction hypothesis we have d_0' ⊢ Γ, φ and
d_1' ⊢ Γ, ¬φ, both with cut-rank ρ(d_i') ≤ |φ| and length |d_i'| ≤ 2^{|d_i|}. The
result then follows by the Reduction Lemma, since
|d_0'| + |d_1'| ≤ 2^{max(|d_0|, |d_1|) + 1} = 2^{|d|}. □


Let 2_0^α = α and 2_{k+1}^α = 2^{2_k^α}.


2.7.1. COROLLARY. If d ⊢ Γ, then we can find a cut-free d* ⊢ Γ with
|d*| ≤ 2_{ρ(d)}^{|d|}.
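
The bound in the corollary is an iterated exponential, which grows very quickly with the cut-rank. As a rough illustration only, the following sketch computes the tower 2_k^m for natural numbers, with 2_0^m = m and 2_{k+1}^m = 2^(2_k^m). The names `tower` and `cut_free_length_bound` are ours and do not appear in the text.

```python
def tower(k: int, m: int) -> int:
    """Return 2_k^m, where 2_0^m = m and 2_{k+1}^m = 2 ** (2_k^m)."""
    if k < 0 or m < 0:
        raise ValueError("k and m must be non-negative")
    value = m
    for _ in range(k):
        value = 2 ** value
    return value


def cut_free_length_bound(length: int, cut_rank: int) -> int:
    """Upper bound 2_{cut_rank}^{length} on the length of a cut-free
    derivation, read off from the corollary (finite derivations only)."""
    return tower(cut_rank, length)


if __name__ == "__main__":
    # A derivation of length 3: the bound for cut-ranks 0, 1 and 2.
    for rank in range(3):
        print(rank, cut_free_length_bound(3, rank))
```

Each unit of cut-rank costs one further exponentiation, which is why the bound is stated in terms of the tower rather than a single power of 2.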


2.8. In this and the next subsection we prove two important consequences 
of the Cut-Elimination Theorem. 

Define the relation "ψ is a subformula of φ" to be the smallest transitive
and reflexive relation with the properties

(i) φ_0, φ_1 are subformulas of φ_0 ∧ φ_1, φ_0 ∨ φ_1, and

(ii) φ(s) is a subformula of ∀x φ(x), ∃x φ(x).
The following is obvious. 


2.8.1. SUBFORMULA PROPERTY. Let d be a cut-free derivation of Γ. Then any
formula occurring in d is a subformula of one of the formulas in Γ.


Hence from the Cut-Elimination Theorem we can conclude that for any
d ⊢ Γ we can find d* ⊢ Γ containing only subformulas of formulas in Γ.


2.9. HERBRAND'S THEOREM. Let d ⊢ ∃x φ(x) with φ(x) quantifier-free. Then
we can find terms s_0, ..., s_{n-1} and a derivation d_0 ⊢ φ(s_0), ..., φ(s_{n-1}).


PROOF. We can assume that d is cut-free. Hence by the subformula
property any instance of the rule ∃ in d has the p.f. ∃x φ(x). Let s_0, ..., s_{n-1}
be all the terms such that φ(s_i) is the m.f. of such an instance of ∃. Now add
φ(s_0), ..., φ(s_{n-1}) to the side formulas of any inference in d, and cancel all
occurrences of ∃x φ(x) in d. It is easy to see that the resulting object is
(essentially) the required derivation. □


3. Transfinite induction 


In this and the following sections we shall deal with (classical) arithmetic 
Z. We begin with a discussion of transfinite induction, particularly of the
question which initial cases of transfinite induction are derivable in Z. By
an extension of the cut-elimination argument in Section 2 we shall show
that transfinite induction up to ε_0 is underivable in Z. This provides a
precise bound, since it is easy to see that for any α < ε_0 transfinite
induction up to α is provable in Z (cf. SCHÜTTE [1960] or Chapter D.4).


3.1. To fix notation we first describe our version of arithmetic Z, which is in 
fact usual arithmetic plus free set and function variables. So we have 
number variables, set variables and for any n >0 variables for n-place 
functions (countably many of each sort). They are denoted by k, m, n, p, by 
X, Y, Z, and by f, g, h, respectively. The terms of Z are built up from a 
constant 0 (for the number 0) and the number variables by means of the 
function symbols S (for successor), +, · and the function variables. The
atomic formulas of Z are of the form s = t, s < t or s ∈ X, where s, t are
terms and X is a set variable. The formulas are built up from these as 
usual, using quantification on number variables only. 

The axioms of Z are the usual axioms for 0, S, < (¬ n < 0,
m < Sn ↔ (m < n ∨ m = n)), +, · and equality, and the induction schema


φ(0) ∧ ∀n (φ(n) → φ(Sn)) → ∀n φ(n),


where φ(n) is an arbitrary formula of the language, possibly containing
additional variables. The theorems of Z are those derivable from the 
axioms by classical logic. 

The various sets and functions one wants to talk about in arithmetic can 
be introduced in definitional (and hence conservative) extensions of Z. 
There is one type of these we are particularly interested in, the so-called 
recursive extensions of Z. Such an extension occurs if 

(i) we introduce a new set symbol M with the defining axiom n ∈ M
↔ φ(n) where φ(n) is quantifier-free, or

(ii) if we have derived ∃m φ(n, m, f) with φ(n, m, f) quantifier-free and
then introduce a new functional symbol F with the defining axioms


φ(n, F(n, f), f),    m < F(n, f) → ¬φ(n, m, f).


Z’ is called a recursive extension of Z if it is obtained from Z by a finite 
sequence of definitional extensions of this sort. Recursive extensions of Z 
will also be denoted by Z. 

One can show that any primitive recursive function can be introduced in 
a recursive extension of Z (cf. SHOENFIELD [1967]). Conversely, any such 
function is certainly recursive. We will determine in Section 4 exactly which 
recursive functionals can be introduced in recursive extensions of Z.

Obviously Z is a conservative extension of its part without function
variables, or without set variables, or without both. These subsystems will
also be denoted by Z.


3.2. Natural well-orderings of order type ε_0


As is well known, the ordinals < ε_0 can be built up from 0 by means of
the ordinal functions α + β and ω^α. This build-up is unique if one uses the
Cantor normal form (cf. BACHMANN [1955]). Hence ordinals < ε_0 may be
considered as finite objects and so they can be coded by natural numbers. It
is easy to choose these codes in such a way that

(i) the coding provides a bijective mapping α ↦ ⌜α⌝ from the ordinals
< ε_0 onto the natural numbers,

(ii) the relation n ≺ m corresponding to the less-than relation between
ordinals < ε_0 is primitive recursive, and

(iii) the number-theoretic functions corresponding to the ordinal functions
α + β, ω^α and their inverses are primitive recursive.

Obviously, any two codings with the properties (i)-(iii) will be primitive
recursive isomorphic. Any of the corresponding ≺-relations between
natural numbers is called a natural well-ordering of order type ε_0. We
choose one of them, denote it by ≺ and fix it for the following. We write
n ≼ m for n ≺ m ∨ n = m.
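
To make the remark that ordinals below ε_0 are finite objects concrete, here is a minimal sketch, assuming a representation of Cantor normal forms by nested tuples: the ordinal ω^a_1 + ... + ω^a_k with a_1 ≥ ... ≥ a_k is stored as the tuple of its exponents, each of which is again such a tuple. This is not the numerical coding ⌜α⌝ of the text, only an illustration of why comparison and addition are computable. All names are ours.

```python
ZERO = ()                      # the empty sum
ONE = (ZERO,)                  # omega^0
OMEGA = (ONE,)                 # omega^1


def cmp_ord(a, b):
    """Compare two Cantor normal forms; return -1, 0 or 1."""
    for x, y in zip(a, b):
        c = cmp_ord(x, y)      # compare exponents recursively
        if c != 0:
            return c
    # One exponent list is a prefix of the other: the shorter is smaller.
    return (len(a) > len(b)) - (len(a) < len(b))


def add(a, b):
    """Ordinal sum a + b: terms of a below the leading term of b are absorbed."""
    if not b:
        return a
    lead = b[0]
    kept = tuple(e for e in a if cmp_ord(e, lead) >= 0)
    return kept + b


def omega_pow(a):
    """The ordinal omega^a."""
    return (a,)


def nat(n):
    """The finite ordinal n, i.e. omega^0 added n times."""
    return (ZERO,) * n


if __name__ == "__main__":
    print(add(ONE, OMEGA) == OMEGA)                 # 1 + omega equals omega
    print(cmp_ord(OMEGA, add(OMEGA, ONE)))          # omega is below omega + 1
```

A numerical coding as in (i)-(iii) could then be obtained by enumerating these finite trees, but the particular enumeration is left open here, as it is in the text.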


3.3. Let Prog(X) ("X is progressive") be the formula ∀n (∀m (m ≺ n
→ m ∈ X) → n ∈ X). The axiom of transfinite induction up to ε_0 is


TI_{ε_0}(X)    Prog(X) → ∀n (n ∈ X).


Here m ≺ n stands for ⟨m, n⟩ ∈ M where ⟨·,·⟩ is one of the usual
primitive recursive pairing functions and M is a symbol for the primitive
recursive set of pair-numbers ⟨m, n⟩ such that m ≺ n holds.


3.3.1. THEOREM (GENTZEN [1943]). TI_{ε_0}(X) is underivable in arithmetic Z.


The proof of this theorem will cover the rest of Section 3. In outline, it 
proceeds as follows. We first embed Z in a "semi-formal" system Z_∞, where
induction is replaced by a rule with infinitely many premisses, the so-called
ω-rule:


    Γ, Δ(ī_0, ..., ī_{l-1})   for all i_0, ..., i_{l-1} < ω
    --------------------------------------------------------
    Γ, Δ(n_0, ..., n_{l-1})


where ī is the i-th numeral, i.e. S^i 0. By a slight extension of the argument
in Section 2 we will obtain a Cut-Elimination Theorem for Z_∞ which gives a
bound on the length of the cut-free derivation in terms of length and
cut-rank of the derivation we started with. In particular, if we started with
(the image in Z_∞ of) a Z-derivation, then this length will be < ε_0. We will
then extend Z_∞ by yet another infinitary rule, the so-called progression rule
introduced by Schütte:


            Γ, ī ∈ X   for all i ≺ j
    Prog    ------------------------
            Γ, s ∈ X


where s is a closed term with numerical value j. It is easy to see that in
Z_∞ + Prog one can give a derivation of Prog(X), and that this derivation has
a finite length. Again a Cut-Elimination Theorem with the same ordinal
bounds holds for Z_∞ + Prog. Now assume that TI_{ε_0}(X) is derivable in Z.
Since Prog(X) is derivable in Z_∞ + Prog with finite length, we can conclude
that the formula n ∈ X (with variable n) is cut-free derivable in Z_∞ + Prog
with a length α < ε_0. Hence also ⌜α + 1⌝ ∈ X is derivable in Z_∞ + Prog with
length ≤ α. But this is a contradiction, since from the form of the rules of
Z_∞ + Prog it follows immediately that any cut-free derivation of ⌜β⌝ ∈ X
has length ≥ β.


3.4. Cut-Elimination for Z_∞


3.4.1. Description of Z_∞

The language of Z_∞ is the same as for Z; we can assume here that we do
not have function variables. For notions connected with derivations we use
the same notation as in Section 2.

A finite set Δ of formulas is called a Z_∞-axiom if Δ consists of atomic or
negated atomic formulas without number variables such that ⋁Δ (the
disjunction of the formulas in Δ) is a tautological consequence of
substitution instances of the quantifier-free axioms of Z.

The normal rules of Z_∞ are


    A    Γ, Δ    if Δ is a Z_∞-axiom,


the rules ∧, ∨_0, ∨_1, ∀, ∃ listed in Section 2 and the ω-rule


    Γ, Δ(ī)   for all i
    -------------------
    Γ, Δ(n)


Furthermore, we have in Z_∞ the cut-rule Cut stated in Section 2.
Note that in the ω-rule we allow n to be empty. Also it is allowed that in
Δ(n) no variable of n actually has a free occurrence. In these cases the
conclusion of the ω-rule is the same as its premiss(es). Such an instance of
the ω-rule is called improper.

The principal formulas (p.f.) in A are all formulas in Δ. In the ω-rule the
p.f. are all formulas in Δ(n). The minor formulas (m.f.) in the i-th premiss
of the ω-rule are all formulas in Δ(ī). So any inference now has the form


    (*)    Γ, Δ_i   for all i < α
           ----------------------
           Γ, Δ

(0 ≤ α ≤ ω), where Δ consists of the p.f. and Δ_i consists of the m.f. in the
i-th premiss. The formulas in Γ are again called side formulas (s.f.) of (*).

Derivations will now be infinite; they are defined as in Section 2.2. (In
the case of the ω-rule we have to add information about the variables n.)
Also the other notions introduced in Section 2.2, particularly the length |d|
and the cut-rank ρ(d) of a derivation d carry over with the same
definitions. Note that |d| is now a countable ordinal, and ρ(d) ≤ ω. We
restrict ourselves throughout to derivations with only finitely many free
and bound variables. The set of variables free in a derivation d is denoted
by Var(d).


3.4.2. EMBEDDING LEMMA. For any φ derivable in Z we have a Z_∞-derivation
d ⊢ φ of length |d| < ω·2 and cut-rank ρ(d) < ω.


This is easy to see for the axioms of Z (for induction one has to use the
ω-rule), and it is trivially preserved by the logical rules.


3.4.3. We now extend the proof given in Section 2 of the Cut-Elimination
Theorem for first-order logic to Z_∞. Obviously we have:


WEAKENING LEMMA. If d ⊢ Δ, then d, Γ ⊢ Γ, Δ with |d, Γ| = |d|, ρ(d, Γ) =
ρ(d) and Var(d, Γ) = Var(d) ∪ V, where V is the set of variables free in Γ.


Note that any closed term s has a numerical value i, and s = ī is a
Z_∞-axiom.


EVALUATION LEMMA. Let s, t be closed terms, both with the same value i. If
d ⊢ Γ(s), then we can find d_0 ⊢ Γ(t) with |d_0| = |d|, ρ(d_0) = ρ(d) and
Var(d_0) = Var(d).


It is easily seen that this holds for instances of the rule A and is preserved
by the other rules.


SUBSTITUTION LEMMA. If d(n) ⊢ Γ(n), then d(s) ⊢ Γ(s) with |d(s)| ≤
|d(n)|, ρ(d(s)) ≤ ρ(d(n)) and Var(d(s)) ⊆ (Var(d) − {n}) ∪ V, where V is
the set of variables free in s.


The proof is by induction on |d(n)|. The only case which requires some
argument is that in which the last inference of d(n) is an instance of the
ω-rule of the form


    Γ(n, m, p), Δ(ī, j̄, p)   for all i, j
    --------------------------------------
    Γ(n, m, p), Δ(n, m, p)


where m, p include all variables free in s = s(m, p) (but Γ(n, m, p),
Δ(n, m, p) may contain free variables other than those shown). By the
induction hypothesis,


    ⊢ Γ(ī, m, k̄), Δ(ī, j̄, k̄)   for all i, j, k


with length < |d(n)| and cut-rank ≤ ρ(d(n)). From some of these
derivations we obtain by the Evaluation Lemma,


    ⊢ Γ(s(j̄, k̄), m, k̄), Δ(s(j̄, k̄), j̄, k̄)   for all j, k


without raising length or cut-rank. The result follows by an application of
the ω-rule.


INVERSION LEMMA. (i) If d ⊢ Γ, φ_0 ∧ φ_1, then we can find d_i ⊢ Γ, φ_i (i = 0, 1)
with |d_i| ≤ |d|, ρ(d_i) ≤ ρ(d) and Var(d_i) ⊆ Var(d).

(ii) If d ⊢ Γ, ∀n ψ(n), then we can find d_0 ⊢ Γ, ψ(n) with |d_0| ≤ |d|,
ρ(d_0) ≤ ρ(d) and Var(d_0) ⊆ Var(d) ∪ {n}.


The proofs of (i) and (ii) are almost identical, both by induction on |d|.
We restrict ourselves to (ii). The only subcase not similar to 2.5 is where the
last inference of d is an instance of the ω-rule. Then that inference is of the
form


    Λ(m), φ(m), Δ(ī), φ(ī)   for all i
    ----------------------------------
    Λ(m), Δ(m), φ(m)


with m.f. Δ(ī), φ(ī), p.f. Δ(m), φ(m) and s.f. Λ(m), φ(m), and Γ = Λ(m),
Δ(m), φ = φ(m). By two applications of the induction hypothesis


    ⊢ Λ(m), ψ(n, m), Δ(ī), ψ(n, ī)   for all i


with length < |d| and cut-rank ≤ ρ(d). The result follows by an application
of the ω-rule.


REDUCTION LEMMA. Let d_0 ⊢ Γ, φ and d_1 ⊢ Δ, ¬φ, both with cut-rank
ρ(d_i) ≤ |φ|. Then we can find d ⊢ Δ, Γ with ρ(d) ≤ |φ|, |d| ≤ |d_0| # |d_1|
and Var(d) ⊆ Var(d_0) ∪ Var(d_1).


Here # denotes the natural (or Hessenberg) sum of ordinals (cf.
BACHMANN [1955]); # is strictly monotonic in both arguments.


The proof is by induction on |d_0| # |d_1|. Again the only (sub-)case not
similar to 2.6 is where φ is a p.f. in the last inference of d_0, and ¬φ is a p.f.
in the last inference of d_1, and the last inference of d_0 or d_1 is an instance of
the ω-rule. By symmetry we can assume the former. We can also assume
that φ is a s.f. of the last inference of d_0, replacing d_0 by d_0, φ if necessary.
So that inference is of the form


    Λ(m), φ(m), Θ(ī), φ(ī)   for all i
    ----------------------------------
    Λ(m), Θ(m), φ(m)


with m.f. Θ(ī), φ(ī), p.f. Θ(m), φ(m) and s.f. Λ(m), φ(m), and Γ = Λ(m),
Θ(m), φ = φ(m). By the Substitution Lemma ⊢ Γ(ī), φ(ī) for all i, with
length < |d_0| and cut-rank ≤ |φ|, and also ⊢ Δ(ī), ¬φ(ī) for all i, with
length ≤ |d_1| and cut-rank ≤ |φ|. By the induction hypothesis ⊢ Γ(ī),
Δ(ī) for all i with length < |d_0| # |d_1| and cut-rank ≤ |φ|. The result
follows by an application of the ω-rule. This completes the proof of the
Reduction Lemma. □


Let ε(α) be the α-th ε-number.


CUT-ELIMINATION THEOREM. (i) If d ⊢ Γ with ρ(d) = k + 1, then we can find
d' ⊢ Γ with ρ(d') ≤ k, |d'| ≤ 2^{|d|} and Var(d') ⊆ Var(d).

(ii) If d ⊢ Γ with ρ(d) = ω, then we can find d' ⊢ Γ with ρ(d') = 0,
|d'| ≤ ε(|d|) and Var(d') ⊆ Var(d).


PROOF. (i) As in 2.7.
(ii) By induction on |d|. We may assume that the last inference of d is a
cut


    Γ, φ    Γ, ¬φ
    -------------
    Γ

for otherwise the result follows by the induction hypothesis. So assume
this. By the induction hypothesis we have d_0 ⊢ Γ, φ and d_1 ⊢ Γ, ¬φ, both
cut-free and with length |d_i| < ε(|d|). The result then follows by applying
(i) |φ| times. □


COROLLARY. If d ⊢ Γ, then we can find a cut-free d* ⊢ Γ with
|d*| ≤ 2_{ρ(d)}^{|d|} if ρ(d) < ω, and |d*| ≤ ε(|d|) if ρ(d) = ω.


3.5. Cut-Elimination for Z_∞ + Prog


We add to Z_∞ the following progression rule:


            Γ, ī ∈ X   for all i ≺ j
    Prog    ------------------------    for s a closed term with value j.
            Γ, s ∈ X


The p.f. in Prog is s ∈ X, and the m.f. in the i-th premiss is ī ∈ X. Now all
the definitions, lemmas and proofs of Section 3.4 carry over almost word
for word. Only part of the proof of the Reduction Lemma must be
extended slightly: So let φ be a p.f. in the last inference of d_0, ¬φ be a p.f.
in the last inference of d_1 and φ be atomic. Let further the last inference in
d_0 be Prog and in d_1 be A. We can assume that φ is a s.f. in the last
inference of d_0; hence it has the form


    Γ, s ∈ X, ī ∈ X   for all i ≺ j
    -------------------------------    s a closed term with value j.
    Γ, s ∈ X


The last (and only) inference of d_1 is an instance Δ, ¬(s ∈ X) of the rule
A. Now it is easy to see that then either

(i) t ∈ X is in Δ for some closed t with value j, or

(ii) Δ is already an instance of the rule A.

In the latter case the result follows by weakening. In the former case we
have by the induction hypothesis Γ, Δ, t ∈ X, ī ∈ X for all i ≺ j, with
length < |d_0| and cut-rank 0. The result follows by an application of the
rule Prog.


3.6. Underivability of TI_{ε_0}(X) in Z


3.6.1. LEMMA. In Z_∞ + Prog we can derive Prog(X) with finite length and
cut-rank.


We give an informal argument which can be easily transformed into a
derivation in Z_∞ + Prog.

Recall Prog(X) = ∀n (∀m (m ≺ n → m ∈ X) → n ∈ X). For any i ≺ j
we have ∀m (m ≺ j → m ∈ X) → ī ∈ X. Hence by the progression rule
∀m (m ≺ j → m ∈ X) → j ∈ X. Hence Prog(X) by the ω-rule.


3.6.2. LEMMA. Let d be a cut-free derivation in Z_∞ + Prog of ⌜β_1⌝ ∈
X, ..., ⌜β_k⌝ ∈ X. Then d has length ≥ min(β_1, ..., β_k).


This follows immediately from the form of the rules of Z_∞ + Prog; use
induction on |d|. The whole derivation must consist of instances of the rule
Prog and of improper instances of the ω-rule.


3.6.3. Now assume TI_{ε_0}(X) is derivable in Z. Recall that TI_{ε_0}(X) =
Prog(X) → ∀n (n ∈ X). By 3.4.2 and 3.6.1 we then have a Z_∞ + Prog-derivation
of n ∈ X (with variable n) with length < ω·2 and finite
cut-rank. By the Cut-Elimination Theorem for Z_∞ + Prog we obtain a
cut-free derivation of n ∈ X in Z_∞ + Prog with length α < ε_0. Hence by the
Substitution Lemma we should also have a cut-free derivation of ⌜α + 1⌝ ∈
X in Z_∞ + Prog with length ≤ α. This contradicts 3.6.2.


4. Bounds from proofs of existential theorems 


We now take up the question "What more do we know if we have
proved a theorem by restricted means than if we merely know that it is
true?" As before, we restrict ourselves to arithmetic Z, where one can get a
satisfactory answer; cf. Section 1.4 for a summary of the results. Using the
terminology of Section 3.1 we can also summarize the results as follows.
We show that a functional F of level ≤ 2 (i.e. with number and function
arguments) can be introduced in a recursive extension of Z iff F is
<ε_0-recursive, i.e. F can be defined by the (Kleene) primitive recursive
operations and α-recursions for α < ε_0.


4.1. <ε_0-recursive functionals


A functional F of level ≤ 2 is called primitive recursive in Kleene's sense
iff it can be defined by means of schemata (i)-(v) below. Here n =
n_0, ..., n_{p-1} is a sequence of number variables and f = f_0, ..., f_{q-1} is a
sequence of function variables.

(i) (Identity) F(n, f) = n_i (for i < p).

(ii) (Function application) F(n, f) = f_i(n_{j_0}, ..., n_{j_{k-1}}) (for i < q and
j_0, ..., j_{k-1} < p).

(iii) (Successor) F(n, f) = n_i + 1 (for i < p).

(iv) (Substitution) F(n, f) = G(H_0(n, f), ..., H_{k-1}(n, f), K_0(·, n, f), ...,
K_{l-1}(·, n, f)).

(v) (Primitive recursion) F(0, m, f) = G(m, f), F(n + 1, m, f) =
H(F(n, m, f), n, m, f).

In (iv), K_i(·, n, f) means λx K_i(x, n, f). Note that F(n, f) is always a
natural number.

Let α be an ordinal < ε_0 and let ≺ be our natural well-ordering of order
type ε_0 (cf. Section 3.2). By α-recursion we mean the following definition
schema.

(vi) (α-Recursion) For n ≺ ⌜α⌝,


    F(n, m, f) = G(n, m, (F↾n)(·, m, f), f)


where


    (F↾n)(i, m, f) := F(i, m, f)   if i ≺ n,
                      0            otherwise.

For ⌜α⌝ ≼ n, F(n, m, f) := 0.

A functional F of level ≤ 2 is called <ε_0-recursive iff F can be defined
by the primitive recursive operations (i)-(v) and α-recursions for α < ε_0.
The class of <ε_0-recursive functionals of level ≤ 2 is denoted by Rec_{<ε_0}.
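
The shape of schema (vi) can be illustrated by a small sketch. It assumes a decidable relation `prec` on natural numbers that is well-founded, and a number `alpha_code` playing the role of ⌜α⌝. The parameters m and f of the schema are omitted for brevity, and the toy ordering in the usage example is the usual < on natural numbers (order type ω), not the natural well-ordering of order type ε_0. All names are ours.

```python
from typing import Callable, Dict


def alpha_recursion(
    G: Callable[[int, Callable[[int], int]], int],
    prec: Callable[[int, int], bool],
    alpha_code: int,
) -> Callable[[int], int]:
    """Return F with F(n) = G(n, F restricted below n) if n prec alpha_code,
    and F(n) = 0 otherwise."""
    memo: Dict[int, int] = {}

    def F(n: int) -> int:
        if not prec(n, alpha_code):
            return 0
        if n not in memo:
            def restricted(i: int) -> int:
                # (F restricted to n)(i) is F(i) below n and 0 elsewhere,
                # so G can never look at values at or above n.
                return F(i) if prec(i, n) else 0
            memo[n] = G(n, restricted)
        return memo[n]

    return F


if __name__ == "__main__":
    def less(a: int, b: int) -> bool:
        return a < b

    # Factorial by recursion along < below the (hypothetical) bound 10.
    factorial = alpha_recursion(
        lambda n, prev: 1 if n == 0 else n * prev(n - 1), less, 10
    )
    print(factorial(5), factorial(10))
```

Termination of `F` rests entirely on the well-foundedness of `prec`: each recursive call goes to a strictly smaller argument, which is the computational content of recursion along the ordering.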


4.2. THEOREM (KREISEL [1952]). If ∀n ∃m φ(n, m) is derivable in Z with
φ(n, m) quantifier-free and without free variables other than those shown,
then we can find a function F ∈ Rec_{<ε_0} such that ∀n φ(n, F(n)) holds.


4.2.1. We first sketch the proof. So let a Z-derivation of ∀n ∃m φ(n, m) or
equivalently of ∃m φ(n, m) be given. As in 3.4.2 we can transform this
Z-derivation into an infinite Z_∞-derivation d(n) ⊢ ∃m φ(n, m) with length
|d(n)| < ω·2 and finite cut-rank. Furthermore, as in 3.4.3, we can transform
d(n) into a cut-free Z_∞-derivation d*(n) ⊢ ∃m φ(n, m) with length
|d*(n)| < ε_0. By the Substitution Lemma in 3.4.3 we obtain for any i a
Z_∞-derivation d*(ī) ⊢ ∃m φ(ī, m) also with length |d*(ī)| < ε_0. Now from
the form of the normal rules of Z_∞ it is clear that d*(ī) contains only
subformulas (cf. 2.8) of ∃m φ(ī, m). We may assume that d*(ī) contains no
free variable (otherwise substitute 0 for any variable free in d*(ī)). Hence
all instances of the ω-rule in d*(ī) must be improper (cf. 3.4.1) and so we
may as well cancel them. This yields a cut-free d** ⊢ ∃m φ(ī, m) which
does not involve the ω-rule. To d** we can apply the same argument as in
the proof of Herbrand's Theorem 2.9 and obtain closed terms s_0, ..., s_{k-1}
and a derivation of φ(ī, s_0), ..., φ(ī, s_{k-1}). At least one of these formulas
must be true. The value at the argument i of the function F we have to
construct is to be the (say) least numerical value of some s_j such that φ(ī, s_j)
is true.
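
The last step of this sketch, choosing the value F(i) among finitely many candidate terms, can be pictured as follows. This is only an illustration under simplifying assumptions: the numerical values of the closed terms s_0, ..., s_{k-1} are taken as given, and φ is supplied as a decidable Python predicate. Nothing here constructs or inspects a derivation; the function name and the sample predicate are ours.

```python
from typing import Callable, Iterable, Optional


def least_witness(
    phi: Callable[[int, int], bool],
    i: int,
    candidate_values: Iterable[int],
) -> Optional[int]:
    """Return the least candidate value v with phi(i, v) true, or None
    if no candidate works (which cannot happen for genuine Herbrand terms)."""
    true_ones = [v for v in candidate_values if phi(i, v)]
    return min(true_ones) if true_ones else None


if __name__ == "__main__":
    # Hypothetical quantifier-free matrix: m * m >= i.
    def phi(i: int, m: int) -> bool:
        return m * m >= i

    print(least_witness(phi, 10, [7, 3, 5]))
```

The search itself is trivial; the real work in the proof is bounding the complexity of computing the candidate terms, which is what the coding of derivations is for.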

What still remains to be shown is that this F is <ε_0-recursive. For this
we use an "effective" counterpart of the above construction, where we
work with codes for Z_∞-derivations instead of using the Z_∞-derivations
themselves.


4.2.2. Codes for Z_∞-derivations

The codes will be natural numbers. They are defined inductively,
corresponding to the inductive build-up of Z_∞-derivations. The inductive
definition is trivial for the finite rules A, ∧, ∨_0, ∨_1, ∀, ∃, Cut. However, for
the ω-rule there is a difficulty since then we in general have infinitely many
premisses. The idea now is to assume that the codes for the premisses can
be enumerated by a primitive recursive function, and to use a code (or
primitive recursive index) of such an enumeration function to construct a
code of the whole derivation. Another essential point is that our codes
should contain sufficient information about the coded derivation. In
particular, if a number u codes a derivation d, then we want to be able to
read off primitive recursively from u

(i) the name of the last inference of d and its p.f., m.f. and s.f. (and
hence its conclusion),

(ii) a bound for the length |d|,

(iii) a bound for the cut-rank ρ(d), and

(iv) a bound for the (finite) set of variables free in d.

The corresponding primitive recursive functions will be denoted by
Rule(u), p.f.(u), m.f.(u), s.f.(u), End(u), |u|, Rank(u) and Var(u),
respectively.

We do not write out all cases of the inductive definition of the predicate
u ∈ Code (u is a code for a Z_∞-derivation), but rather give two examples
corresponding to the rule Cut and the ω-rule.

Cut: If u, v ∈ Code, End(u) = ⌜Γ, φ⌝, End(v) = ⌜Γ, ¬φ⌝ and |u|, |v| ≺
a, then ⟨⌜Cut⌝, ⌜φ⌝, ⌜Γ⌝, a, u, v⟩ ∈ Code.

ω-rule: If, for any i, [e](i) =: u_i ∈ Code, End(u_i) = ⌜Γ, Δ(ī)⌝, |u_i| ≺ a,
Rank(u_i) = k and Var(u_i) ⊆* b, then ⟨⌜ω⌝, ⌜Δ(n)⌝, ⌜n⌝, ⌜Γ⌝, a, k, b, e⟩ ∈
Code.

Here [e] denotes the primitive recursive function coded by e. ⌜···⌝
denotes as usual a natural code for the finite object ···; ⊆* corresponds
(under the relevant coding of finite sets of variables) to ⊆; ⟨x_0, ..., x_{l-1}⟩ is a
primitive recursive coding of finite sequences of natural numbers with
primitive recursive inverses (x)_i, i.e. (⟨x_0, ..., x_{l-1}⟩)_i = x_i for i < l. We also
skip the (trivial) primitive recursive definitions of the functions Rule(u), ...
mentioned above.
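
The text leaves the choice of sequence coding open. One standard possibility, shown here purely as an assumption, is the prime-power coding ⟨x_0, ..., x_{l-1}⟩ = p_0^(x_0+1) · ... · p_{l-1}^(x_{l-1}+1), with the inverse (x)_i read off from the exponent of the i-th prime. The function names are ours, and the convention that out-of-range components decode to 0 is also an assumption.

```python
def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for short sequences)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1


def encode(xs):
    """Code a finite sequence of natural numbers as one natural number."""
    code = 1
    for p, x in zip(primes(), xs):
        code *= p ** (x + 1)   # exponent x + 1 keeps zero entries visible
    return code


def length(code):
    """Number of components of a coded sequence."""
    count = 0
    for p in primes():
        if code % p != 0:
            return count
        count += 1


def component(code, i):
    """The inverse (code)_i; components beyond the length decode to 0."""
    for index, p in enumerate(primes()):
        if index == i:
            exponent = 0
            while code % p == 0:
                code //= p
                exponent += 1
            return exponent - 1 if exponent > 0 else 0


if __name__ == "__main__":
    u = encode([3, 0, 2])
    print(u, length(u), [component(u, i) for i in range(length(u))])
```

With such a coding, the components of a derivation code such as the bound a or the subcodes u, v in the Cut case are recovered by `component`, which is the role played by (u)_i in the definitions that follow.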


4.2.3. It is easy to see that all Z_∞-derivations obtained by embedding Z in
Z_∞ (cf. 3.4.2) can be coded, and that any such code has length |u| ≺ ⌜ω·2⌝.


4.2.4. We now show that to the operations on Z_∞-derivations defined in
3.4.3 (weakening, substitution, etc.) there correspond primitive recursive
functions on the codes. This will follow by easy applications of the
Primitive Recursion Theorem of KLEENE [1958]. The lemmas are stated in
the order they can be proved. We shall only sketch the proof for one of
them (a typical example).


WEAKENING LEMMA. We have a primitive recursive function Weak such that
for any u ∈ Code and any Γ the following holds.
(i) Weak(u, ⌜Γ⌝) =: u_0 ∈ Code,
(ii) End(u_0) = ⌜Γ, Δ⌝ if End(u) = ⌜Δ⌝,
(iii) |u_0| = |u|,
(iv) Rank(u_0) = Rank(u), and
(v) Var(u_0) = Var(u) ∪* V* with V the set of variables free in Γ.


EVALUATION LEMMA. We have a primitive recursive function Eval such that
for any u ∈ Code, Γ(n), variable n and closed terms s, t with the same value
the following holds.

(i) Eval(u, ⌜Γ(n)⌝, ⌜n⌝, ⌜s⌝, ⌜t⌝) =: u_0 ∈ Code,

(ii) End(u_0) = ⌜Γ(t)⌝ if End(u) = ⌜Γ(s)⌝,

(iii) |u_0| = |u|,

(iv) Rank(u_0) = Rank(u), and

(v) Var(u_0) = Var(u).


SUBSTITUTION LEMMA. We have a primitive recursive function Sub such that
for any u ∈ Code, variable n and term s the following holds.
(i) Sub(u, ⌜n⌝, ⌜s⌝) =: u_0 ∈ Code,
(ii) End(u_0) = ⌜Γ(s)⌝ if End(u) = ⌜Γ(n)⌝,
(iii) |u_0| ≼ |u|,
(iv) Rank(u_0) = Rank(u), and
(v) Var(u_0) ⊆* (Var(u) −* {n}*) ∪* V* where V is the set of variables free
in s.


For the proof one has to construct a primitive recursive function (also by
the Primitive Recursion Theorem) corresponding to the change of bound
variables in Z_∞-derivations.


INVERSION LEMMA. (1) We have primitive recursive functions Inv_i (i = 0, 1)
such that for any u ∈ Code and conjunction φ_0 ∧ φ_1 the following holds.
(i) Inv_i(u, ⌜φ_0 ∧ φ_1⌝) =: u_i ∈ Code,
(ii) End(u_i) = ⌜Γ, φ_i⌝ if End(u) = ⌜Γ, φ_0 ∧ φ_1⌝ with φ_0 ∧ φ_1 not in Γ,
(iii) |u_i| ≼ |u|,
(iv) Rank(u_i) = Rank(u), and

(v) Var(u_i) ⊆* Var(u).

(2) We have a primitive recursive function Inv such that for any u ∈ Code
and generalization ∀n ψ(n) the following holds.

(i) Inv(u, ⌜∀n ψ(n)⌝) =: u_0 ∈ Code,

(ii) End(u_0) = ⌜Γ, ψ(n)⌝ if End(u) = ⌜Γ, ∀n ψ(n)⌝ with ∀n ψ(n) not in Γ,

(iii) |u_0| ≼ |u|,

(iv) Rank(u_0) = Rank(u), and

(v) Var(u_0) ⊆* Var(u) ∪* {n}*.


REDUCTION LEMMA. We have a primitive recursive function Red such that
for any u_0, u_1 ∈ Code and formula φ with Rank(u_i) ≤ |φ| (i = 0, 1) the
following holds.

(i) Red(u_0, u_1, ⌜φ⌝) =: u ∈ Code,

(ii) End(u) = ⌜Γ, Δ⌝ if End(u_0) = ⌜Γ, φ⌝ with φ not in Γ and End(u_1) =
⌜Δ, ¬φ⌝ with ¬φ not in Δ,

(iii) Rank(u) = |φ|,

(iv) |u| ≼ ⌜ξ_0 # ξ_1⌝ if |u_i| = ⌜ξ_i⌝, and

(v) Var(u) ⊆* Var(u_0) ∪* Var(u_1).


CUT-ELIMINATION THEOREM. We have a primitive recursive function Elim
such that for any u ∈ Code with Rank(u) = k + 1 the following holds.
(i) Elim(u) =: u' ∈ Code,
(ii) End(u') = End(u),
(iii) |u'| ≼ ⌜2^ξ⌝ with ⌜ξ⌝ := |u|,
(iv) Rank(u') = k, and
(v) Var(u') ⊆* Var(u).


PROOF. By the Primitive Recursion Theorem we can define a primitive
recursive function Elim with code e as follows.

Case 1. Rule(u) = ⌜Cut⌝. Let m.f.(u) = {φ, ¬φ}*.

Subcase 1.1. |φ| + 1 < Rank(u). Define Elim(u) = ⟨(u)_0, (u)_1, (u)_2, ⌜2^ξ⌝,
Elim((u)_4), Elim((u)_5)⟩ where ⌜ξ⌝ = (u)_3.

Subcase 1.2. |φ| + 1 = Rank(u). Define Elim(u) = Red(Elim((u)_4),
Elim((u)_5), ⌜φ⌝).

Case 2. Rule(u) = ⌜ω⌝. Define Elim(u) = ⟨(u)_0, ..., (u)_3, ⌜2^ξ⌝, k, (u)_6, e'⟩
where ⌜ξ⌝ = (u)_4 and e' = e'(e, u) is a code of Elim([(u)_7](n)) as a primitive
recursive function of n; e' as a function of e and u is primitive recursive.
The other cases are treated similarly. By ≺-induction on |u| one can prove
easily that Elim(u) has the required properties. □


4.2.5. Now we prove Theorem 4.2, following the sketch in 4.2.1. So
let a Z-derivation of ∃m φ(n, m) be given and let u be a code of
the corresponding Z_∞-derivation (cf. 4.2.3). Hence |u| = ⌜ξ⌝ ≺ ⌜ω·2⌝.
By a finite number of applications of the Cut-Elimination Theorem in 4.2.4
we obtain a code u* of a cut-free Z_∞-derivation of ∃m φ(n, m) with
|u*| ≼ ⌜2_{Rank(u)}^ξ⌝. Then Sub(u*, ⌜n⌝, ⌜ī⌝) is a code for a cut-free Z_∞-derivation
of ∃m φ(ī, m). We may assume Var(Sub(u*, ⌜n⌝, ⌜ī⌝)) = ∅* (otherwise
apply Sub(·, ⌜m⌝, ⌜0⌝) for any m ∈* Var(Sub(u*, ⌜n⌝, ⌜ī⌝))). Hence the
Z_∞-derivation coded by Sub(u*, ⌜n⌝, ⌜ī⌝) contains only improper instances
of the ω-rule, which may be cancelled. However, the function F_0
corresponding for codes to this cancellation is not primitive recursive, but
only <ε_0-recursive: in case Rule(v) = ⌜ω⌝ we have to define F_0(v) =
F_0([(v)_7](0)) and we only know |[(v)_7](0)| ≺ |v|. Now from
F_0(Sub(u*, ⌜n⌝, ⌜ī⌝)) we can easily read off primitive recursively all (closed)
terms s_0, ..., s_{k-1} used in instances of the rule ∃ in the corresponding
derivation. Since by the same argument as in the proof of Herbrand's
Theorem 2.9 we get a derivation of φ(ī, s_0), ..., φ(ī, s_{k-1}), we know
that at least one φ(ī, s_j) must be true. Let F(i) be the least numerical
value of some s_j such that φ(ī, s_j) is true. This completes the proof of
Theorem 4.2. □


4.3. We now turn to a generalization of Theorem 4.2 to arbitrary
Z-formulas. For the formulation of the result we need the notion of the
Herbrand normal form φ_H of a formula φ, which we introduce first.

The general definition of φ_H is sufficiently explained by the following
example. Let φ = ∃n ∀m ∃k ∀p ψ(n, m, k, p). Then φ_H =
∃n, k ψ(n, f(n), k, g(n, k)) with function variables f, g. One can show easily
that φ → φ_H is derivable (logically and hence) in Z, and furthermore that if
φ_H is derivable in Z then so is φ (cf. SHOENFIELD [1967]).

In general, for an arbitrary prenex formula φ the Herbrand normal form
φ_H is obtained from φ by (i) dropping all universal quantifiers in the prefix
of φ, and (ii) replacing any variable m bound by a universal quantifier in φ
by f(n), where n are all variables preceding m in the prefix of φ and bound
by existential quantifiers, and f is a new function variable. Hence φ_H has
the form ∃m φ^H with φ^H quantifier-free and generally containing new
function variables. Again φ → φ_H is derivable (logically and hence) in Z,
and if φ_H is derivable in Z then so is φ.
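
The passage from a prenex prefix to its Herbrand normal form is purely mechanical, as the following sketch shows. It assumes the prefix is given as a list of pairs (quantifier, variable) with "E" for ∃ and "A" for ∀, and it treats the matrix only through the list of its argument variables. The function names, the input format and the choice of fresh names f, g, h, f3, ... are ours.

```python
from typing import Dict, List, Tuple


def herbrand_normal_form(
    prefix: List[Tuple[str, str]]
) -> Tuple[List[str], Dict[str, str]]:
    """Return the remaining existential variables and, for each universally
    bound variable, the term f(...) that replaces it."""
    existentials: List[str] = []
    substitution: Dict[str, str] = {}
    for quantifier, var in prefix:
        if quantifier == "E":
            existentials.append(var)
        elif quantifier == "A":
            count = len(substitution)
            name = "fgh"[count] if count < 3 else f"f{count}"
            # The new function depends on all existential variables so far;
            # with none so far this yields a symbol with no arguments, "f()".
            substitution[var] = f"{name}({', '.join(existentials)})"
        else:
            raise ValueError(f"unknown quantifier: {quantifier!r}")
    return existentials, substitution


def render(prefix: List[Tuple[str, str]], matrix_name: str, args: List[str]) -> str:
    """Render the Herbrand normal form as a string."""
    existentials, substitution = herbrand_normal_form(prefix)
    new_args = [substitution.get(a, a) for a in args]
    return f"∃{', '.join(existentials)} {matrix_name}({', '.join(new_args)})"


if __name__ == "__main__":
    example = [("E", "n"), ("A", "m"), ("E", "k"), ("A", "p")]
    print(render(example, "ψ", ["n", "m", "k", "p"]))
```

Applied to the prefix ∃n ∀m ∃k ∀p of the example in the text, the sketch replaces m by f(n) and p by g(n, k), in agreement with the form given there.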


4.4. THEOREM (KREISEL [1952]). Let $\varphi$ be a formula without set variables
derivable in Z. Let $\varphi_H = \exists m\,\varphi^H(f, n, m)$ be its Herbrand normal form. Then
we can find $<\varepsilon_0$-recursive functionals $G$ such that for all functions $F$ and
numbers $i$, $\varphi^H(F, i, G(F, i))$ holds.


PROOF. For simplicity assume $\varphi_H = \exists m\,\varphi^H(f, n, m)$ with $\varphi^H$ quantifier-free
and without free variables other than those shown. Since by assumption $\varphi$
is derivable in Z, we know that also $\varphi_H$ is derivable in Z. We have to
construct a functional $G \in \mathrm{Rec}_{<\varepsilon_0}$ such that for any function $F$ and number $i$,
$\varphi^H(F, i, G(F, i))$ holds.

The proof is completely parallel to the proof of Theorem 4.2; we only 
have to relativize it to a given function F. 

We first introduce a relativization $Z_\infty(F)$ of $Z_\infty$. The language of $Z_\infty(F)$ is
the language of Z without set variables and with just one distinguished
function variable $f$. A finite set $A$ of formulas is called a $Z_\infty(F)$-axiom if $A$
consists of atomic or negated atomic formulas without number variables
such that $\bigvee A$ is a tautological consequence of substitution instances of the
quantifier-free axioms of Z and the additional axioms $f(j) = k$ for all $j, k$
such that $F(j) = k$. The rules of $Z_\infty(F)$ are the same as the rules for $Z_\infty$.

The treatment of cut-elimination for $Z_\infty$ in Section 3.4 carries over nearly
unchanged to $Z_\infty(F)$. Just note, for the Evaluation Lemma, that any term
$s(f)$ without number variables has a numerical value $i$ under the
assignment $f \mapsto F$, and $s(f) = i$ is a $Z_\infty(F)$-axiom. Now the proof of Theorem 4.2
can be adapted almost word for word, with the following exceptions.

(1) In the definition of codes for $Z_\infty(F)$-derivations we replace $[e](i)$ by
$[e](F, i)$; $[e]$ is now the $e$-th primitive recursive functional (in the sense of
Kleene).

(2) The functions Weak, Eval, Sub, $\mathrm{Inv}_\land$, $\mathrm{Inv}_\forall$, Red, Elim, $F_0$ are to be
replaced by functionals with $F$ as an additional argument. This completes
the proof of Theorem 4.4. □


4.5. We now state a converse to Theorem 4.4 (and hence also to Theorem 
4.2) and sketch its proof. 


THEOREM. Let $F$ be a $<\varepsilon_0$-recursive functional. Then $F$ can be introduced in
a recursive extension of Z.


4.5.1. For the proof we need an auxiliary notion: the modulus of continuity 
of a functional F. We now introduce this notion. 

First note that any $<\varepsilon_0$-recursive functional $F(n, f)$ is continuous in the
sense that it depends only on a finite part of any of its function arguments.
Or equivalently, $F$ is continuous for the discrete topology of $\mathbb{N}$ and the
corresponding product topology on product spaces. This can be seen easily
by induction over the build-up of $<\varepsilon_0$-recursive functionals. A functional
$M_F$ is called a modulus of continuity for $F$ iff for any $n, f$, $M_F(n, f)$ codes a
finite set $S$ of natural numbers such that for any two tuples of functions $f$
and $f'$ coinciding on $S$ we have $F(n, f) = F(n, f')$.
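
The notion can be made concrete with a small Python sketch. It is only a runtime illustration, not the proof-theoretic construction of 4.5.2: it assumes the functional reaches its function argument solely by calling it, and it records the queried arguments as the finite set S. The names `run_with_modulus`, `F` and `f_prime` are hypothetical.

```python
from typing import Callable, Set, Tuple

Fn = Callable[[int], int]

def run_with_modulus(F: Callable[[int, Fn], int], n: int, f: Fn) -> Tuple[int, Set[int]]:
    """Evaluate F(n, f) and return the value together with the finite set
    of arguments at which f was actually queried."""
    queried: Set[int] = set()

    def recorder(x: int) -> int:
        queried.add(x)
        return f(x)

    return F(n, recorder), queried

def F(n: int, f: Fn) -> int:
    # a sample continuous functional (an assumption made for the example)
    return f(f(n)) + f(0)

value, S = run_with_modulus(F, 2, lambda x: x + 1)

# f_prime agrees with the successor function on S and differs elsewhere
f_prime = lambda x: x + 1 if x in S else 99
```

Here S is {0, 2, 3}, and F takes the same value on the successor function and on `f_prime`, as the definition of a modulus requires.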


4.5.2. We shall prove the following extension of Theorem 4.5. 


THEOREM. Let $F$ be a $<\varepsilon_0$-recursive functional. Then we can construct a
$<\varepsilon_0$-recursive modulus of continuity $M_F$ for $F$, and $F$ as well as $M_F$ can be
introduced in a recursive extension of Z.


Remark. The fact that any $<\varepsilon_0$-recursive functional $F$ has a
$<\varepsilon_0$-recursive modulus of continuity was first proved by Kreisel in lectures
(71/72); other proofs are in TROELSTRA [1973] and in SCHWICHTENBERG
[1973].


The proof is by induction on the build-up of $<\varepsilon_0$-recursive functionals.
We only treat the case of $\alpha$-recursion, the other cases being simpler or
trivial. So let

$$F(n, m, f) = G(n, m, (F{\upharpoonright}n)(\cdot, m, f), f).$$

By the induction hypothesis we can assume that $G$ and a modulus of
continuity $M_G$ of $G$ have been introduced (in a recursive extension of Z).

We first show how $F$ can be introduced. The trick is not to introduce $F$
directly, but via another functional which assigns to any argument $n, m, f$ a
computation $u$ of $F$ at this argument. Here $u$ is called a computation of $F$
at $n, m, f$ iff the following holds.

(i) $u$ is a finite function with domain $\{a_0, a_1, \ldots, a_{k-1}\}$ where
$a_0 \prec a_1 \prec \cdots \prec a_{k-1} = n$.

(ii) $u(a_i) = G(a_i, m, (u{\upharpoonright}a_i), f)$ for $i < k$, where $(u{\upharpoonright}a_i)$ is defined by

$$(u{\upharpoonright}a_i)(x) = \begin{cases} u(x) & \text{if } x = a_j \text{ for some } j < i, \\ 0 & \text{otherwise.} \end{cases}$$

(iii) $M_G(a_i, m, (u{\upharpoonright}a_i), f) \cap \{x \mid x \prec a_i\} \subseteq \{a_0, \ldots, a_{i-1}\}$ for $i < k$.

Note that any of the conditions (i)-(iii) is quantifier-free and does not involve $F$.
Now one can prove in Z $\forall n, m, f\ \exists u$ ($u$ is a computation of $F$ at $n, m, f$), by
$\alpha$-induction on $n$. To see this, observe that adding to Z the arithmetical
axiom of choice and second-order logic (but no second-order instances of
the induction schema) gives a conservative extension of Z; this can either
be proved directly using the method indicated in Chapter D.4, 5.5.1,
or else follows from the much stronger result in Chapter D.4, 8.7.
Hence the corresponding functional giving $u$ as a functional of $n, m, f$ can
be introduced, and from this $F$ can be easily defined explicitly.
Furthermore, from the conditions (i)-(iii) and the fact that $M_G$ is a modulus of
continuity for $G$ one can prove (in Z) the defining equations of $F$.
Now $M_F$ can be defined from $F$ by the following $\alpha$-recursion


$$M_F(n, m, f) = S \cup \bigcup_{a \in S} M_F(a, m, f),$$

where

$$S = M_G(n, m, (F{\upharpoonright}n)(\cdot, m, f), f).$$

Hence, by the argument just given, $M_F$ too can be introduced. By
$\alpha$-induction on $n$ one can show in Z that $M_F$ is a modulus of continuity for
$F$, using the defining equations for $F$ and the fact that $M_G$ is a modulus of
continuity for $G$.


5. Transfinite induction and the reflection principle 


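5.1. We now consider Z without set and function variables. The (uniform)
reflection principle for Z is the schema

RP    $\mathrm{Der}(x, \ulcorner\varphi(\dot n)\urcorner) \to \varphi(n)$

where $\mathrm{Der}(x, y)$ is the primitive recursive predicate which holds iff $x$ codes
a Z-derivation $d \vdash \varphi$ and $y = \ulcorner\varphi\urcorner$, and $\ulcorner\varphi(\dot n)\urcorner$ is a primitive recursive
function of $n$ and denotes a code for the formula obtained from $\varphi(n)$ by
substituting the numerals $\bar n$ for the variables $n$, i.e.,
$\ulcorner\varphi(\dot n)\urcorner = \mathrm{Subst}(\ulcorner\varphi(n)\urcorner, \ulcorner n\urcorner, \mathrm{Num}(n))$ with the obvious primitive recursive functions
Subst and Num. Furthermore, we assume that $x$ is not free in $\varphi(n)$. Note
that RP trivially implies the consistency of Z, i.e. the formula
$\forall x\,\neg\mathrm{Der}(x, \ulcorner 0 = 1\urcorner)$. The schema of transfinite induction up to $\varepsilon_0$ for Z is

$\mathrm{TI}_{\varepsilon_0}$    $\forall n\,(\forall m\,(m \prec n \to \varphi(m)) \to \varphi(n)) \to \forall n\,\varphi(n)$.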


5.2. THEOREM (KREISEL and LEVY [1968]). Z together with the schema RP is
equivalent to Z together with the schema $\mathrm{TI}_{\varepsilon_0}$.


PROOF. We begin with the easy part and show that $\mathrm{TI}_{\varepsilon_0}$ is derivable in
Z + RP. So let $\varphi(n)$ be given and define $\psi(k)$ to be

$$\forall n\,(\forall m\,(m \prec n \to \varphi(m)) \to \varphi(n)) \to \forall n\,(n \prec F(k) \to \varphi(n))$$

where $F(k) = \ulcorner\omega_k\urcorner$, $\omega_0 = 1$, $\omega_{k+1} = \omega^{\omega_k}$. Since $Z \vdash \forall m\,\exists k\; m \prec F(k)$, it suffices
to derive $\psi(k)$ in Z + RP. Now from the proof of GENTZEN [1943] (or
SCHÜTTE [1960]) of transfinite induction up to $\omega_k$ in Z one can extract a
primitive recursive function $G$ such that $Z \vdash \forall k\,\mathrm{Der}(G(k), \ulcorner\psi(\dot k)\urcorner)$. From
this and RP we obtain $\psi(k)$, as required.

The proof of the converse will cover the rest of this section. We have to
show that RP is derivable in $Z + \mathrm{TI}_{\varepsilon_0}$. So assume $\mathrm{Der}(x, \ulcorner\varphi(\dot n)\urcorner)$. Now the
following lemma is derivable in Z (cf. 4.2.3 and 5.2.2):


EMBEDDING LEMMA. We have a primitive recursive function Emb such that
for any $x, y$ with $\mathrm{Der}(x, y)$ the following holds.
(i) $\mathrm{Emb}(x) =: u_x \in \mathrm{Code}$,
(ii) $\mathrm{End}(u_x) = y$, and
(iii) $|u_x| \prec \ulcorner\omega \cdot 2\urcorner$.


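Also the Cut-Elimination Theorem of 4.2.4 is derivable in $Z + \mathrm{TI}_{\varepsilon_0}$ (cf.
5.2.2). Hence we can prove in $Z + \mathrm{TI}_{\varepsilon_0}$ that we have a $u^* \in \mathrm{Code}$
(depending primitive recursively on $x$) with $\mathrm{End}(u^*) = \ulcorner\varphi(\dot n)\urcorner$ and
$\mathrm{Rank}(u^*) = 0$. In 5.2.1 we shall give within Z a partial truth definition $\mathrm{Tr}_q$
with the following characteristic property: For any formula $\psi(n)$ with
depth of quantifier-nesting $\mathrm{QD}(\psi(n)) = q$ one can prove in Z

$$\mathrm{Tr}_q(\ulcorner\psi(\dot n)\urcorner) \leftrightarrow \psi(n).$$

Now the following lemma obviously holds (use $\prec$-induction on $|u|$) and is
derivable in $Z + \mathrm{TI}_{\varepsilon_0}$ (cf. 5.2.2):


TRUTH LEMMA. For any $u \in \mathrm{Code}$ with $\mathrm{Rank}(u) = 0$ and $\mathrm{End}(u) = \ulcorner\varphi\urcorner$
where $\mathrm{QD}(\varphi) = q$ we have $\mathrm{Tr}_q(\ulcorner\varphi\urcorner)$.


Specializing this to $u = u^*$ we obtain $\mathrm{Tr}_q(\ulcorner\varphi(\dot n)\urcorner)$ and hence $\varphi(n)$, both
in $Z + \mathrm{TI}_{\varepsilon_0}$.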


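5.2.1. We define for any $q \ge 0$ a set $\mathrm{Tr}_q$ which is intended to give a partial
truth definition for all Z-formulas $\varphi$ with depth of quantifier-nesting
$\mathrm{QD}(\varphi) \le q$.

First note that we can easily introduce a function Val (in a recursive
extension of Z) such that for any term $s(n)$, $\mathrm{Val}(\ulcorner s(\dot n)\urcorner) = s(n)$ is derivable
in Z.

DEFINITION. $\mathrm{Tr}_q$ is defined as follows.
(i) $\mathrm{Tr}_q(\ulcorner P s_0(\dot n) \cdots s_{r-1}(\dot n)\urcorner) \leftrightarrow P(\mathrm{Val}(\ulcorner s_0(\dot n)\urcorner), \ldots, \mathrm{Val}(\ulcorner s_{r-1}(\dot n)\urcorner))$ for any
predicate or set symbol $P$.
(ii) $\mathrm{Tr}_q(\ulcorner\varphi_0 \land \varphi_1\urcorner) \leftrightarrow \mathrm{Tr}_q(\ulcorner\varphi_0\urcorner) \land \mathrm{Tr}_q(\ulcorner\varphi_1\urcorner)$ if $\mathrm{QD}(\varphi_i) \le q$. Similarly for $\lor$.
(iii) $\mathrm{Tr}_q(\ulcorner\forall n\,\varphi(n)\urcorner) \leftrightarrow \forall n\,\mathrm{Tr}_{q-1}(\ulcorner\varphi(\dot n)\urcorner)$ if $q \ge 1$ and $\mathrm{QD}(\varphi(n)) \le q - 1$.
Similarly for $\exists$.


LEMMA. $\mathrm{Tr}_q(\ulcorner\varphi(\dot n)\urcorner) \leftrightarrow \varphi(n)$ is derivable in Z if $\mathrm{QD}(\varphi(n)) \le q$.
The proof is obvious, using induction on $|\varphi(n)|$.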
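
The index q of a partial truth definition is governed by the depth of quantifier-nesting. As a minimal sketch, assuming formulas are encoded as nested tuples with the tags "atom", "and", "or", "forall" and "exists" (an encoding chosen for this example only), QD can be computed as follows.

```python
def qd(formula) -> int:
    """Depth of quantifier-nesting of a formula given as a nested tuple.

    Assumed encoding: ("atom", name), ("and", A, B), ("or", A, B),
    ("forall", var, A), ("exists", var, A).
    """
    tag = formula[0]
    if tag == "atom":
        return 0
    if tag in ("and", "or"):
        return max(qd(formula[1]), qd(formula[2]))
    if tag in ("forall", "exists"):
        return 1 + qd(formula[2])
    raise ValueError(f"unknown connective: {tag!r}")

phi = ("forall", "n", ("or", ("atom", "P"), ("exists", "m", ("atom", "Q"))))
depth = qd(phi)
```

For this `phi` the depth is 2, so in the notation of the text the smallest applicable truth definition is the one with q = 2.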


5.2.2. We now show that the Embedding Lemma and the Truth Lemma
stated in Section 5.2 as well as all the lemmas in 4.2.4 up to and including
the Cut-Elimination Theorem are derivable in $Z + \mathrm{TI}_{\varepsilon_0}$. The only point to
verify is that all these lemmas can be formulated in the language of Z; the
formalization of the proofs is then routine. Now the only possible obstacle
against such a formulation is the occurrence of the inductively defined
notion of a code for a $Z_\infty$-derivation (cf. 4.2.2) in all these lemmas. We now
show how this notion can be represented in purely generalized form.

Infinite $Z_\infty$-derivations may be considered as well-founded trees, where
at each node there is either no branching at all (i.e. it is a bottommost node)
and an instance of the rule A is affixed, or there is a 1-fold branching
(corresponding to the rules $\lor_0$, $\lor_1$, $\exists$), or a 2-fold branching (corresponding
to the rules $\land$, Cut), or an $\omega$-fold branching (corresponding to the $\omega$-rule).
Then any code $u$ of a $Z_\infty$-derivation $d$ can be thought of as obtained
inductively by affixing to each node of the tree corresponding to $d$ a code
of the corresponding subderivation. Hence the property $u \in \mathrm{Code}$ is
equivalent to $u$ having such a well-founded genealogic tree. But the latter
fact can be easily written in purely generalized form: One has to express
that at any node (= sequence number) $n$ the tree is locally correct, i.e. that
the code $u_n$ affixed there ($u_n$ can be easily defined by induction on $n$) and
all its predecessors $u_{n*i}$, $i = 0, 1, 2, \ldots$, fulfill a relation as given in the
definition of codes for $Z_\infty$-derivations. The well-foundedness is then
obtained automatically, since in particular $|u_{n*i}| \prec |u_n|$ is required and $\prec$
is a well-ordering. □


1. Introduction


1.1. To explain the results and methods characteristic of the theory of 
proofs we reconsider here a special case of the following corollary to the 
Completeness Theorem. 


THEOREM. If $G(x_1, \ldots, x_n)$ is a quantifier-free formula of predicate logic and
$\exists x_1 \cdots x_n\, G(x_1, \ldots, x_n)$ is valid, then there are terms $t_{ij}$, for $1 \le i \le m$ and
$1 \le j \le n$, such that $\bigvee_{1 \le i \le m} G(t_{i1}, \ldots, t_{in})$ is valid.


This result is known in the literature as Herbrand’s Theorem; see e.g. 
Theorem 2.9 of Chapter D.2. 
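
A standard illustration, which is not taken from the text: assume a unary predicate symbol $P$, a unary function symbol $f$ and a constant $c$ (all chosen for this example), and let $G(x)$ be $\neg P(x) \lor P(f(x))$. Then $m = 2$ instances are needed.

```latex
\[
  G(x) \;:=\; \neg P(x) \lor P(f(x)), \qquad \exists x\, G(x) \ \text{is valid},
\]
\[
  G(c) \lor G(f(c)) \;=\;
  \bigl(\neg P(c) \lor P(f(c))\bigr) \lor \bigl(\neg P(f(c)) \lor P(f(f(c)))\bigr).
\]
```

The disjunction is valid because it contains both $P(f(c))$ and $\neg P(f(c))$, whereas no single instance $G(t)$ is valid. In the notation of the theorem, $n = 1$, $t_{11} = c$ and $t_{21} = f(c)$.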


1.2. There is a recursive method of passing from any derivation of
$\exists x_1 \cdots x_n\, G(x_1, \ldots, x_n)$ in the predicate calculus to the list of terms and a
derivation of $\bigvee_{1 \le i \le m} G(t_{i1}, \ldots, t_{in})$. The existence of this method, and of a
partial recursive one independent of the derivation, follows from the
soundness and completeness of the rules. The non-existence of a recursive
method depending only on the formula follows from the recursive
undecidability of validity.


1.3. The theory of proofs uses these general observations for orientation 
and makes the following types of distinctions in order to get more 
rewarding results: 

(i) Not arbitrary quantifier-free $G$ are considered but rather certain
subclasses. For example, in the language of semi-groups, we pick out those
formulae which naturally express that two words (terms) are equal in a
certain finitely generated, finitely presented cancellation semi-group. More
generally, if $A_1, \ldots, A_p$ are fixed prime formulae, and $I_1, \ldots, I_q$ fixed
implicational formulae with $I_j = P^j_1 \to (\cdots (P^j_{k_j} \to C^j) \cdots)$ for $P^j_i$ and $C^j$
prime, we consider the collection $L$ of formulae $G$ of the form
$(\bigwedge_{1 \le i \le p} A_i \land \bigwedge_{1 \le j \le q} I_j \land \bigwedge_{1 \le k \le m} B_k) \to B_{m+1}$ for $B_k$ prime and not containing
$x_1 \cdots x_n$.

(ii) Not arbitrary sound complete rules are considered but rather rules
adapted to the mathematical content of $L$. In the above example, for
relations $R_1 \cdots R_m$, if $w_1 = w_2$ holds then $w_1 = w_2$ is derivable from $R_1 \cdots R_m$, the
associativity axioms $((ab)c) = (a(bc))$, and the identity axioms $a = a$ by
the rule of substituting equals for equals together with the cancellation laws

$$\frac{(ab) = (ac)}{b = c} \quad\text{and}\quad \frac{(ba) = (ca)}{b = c}.$$

More generally, if $\exists x_1 \cdots x_n\, G(x_1, \ldots, x_n)$ is valid, for $G$ in $L$, then $B_{m+1}$ is
derivable from $B_1 \cdots B_m$ together with the axioms $\theta A_i$ for each
substitution $\theta$ for $x_1 \cdots x_n$, by the rules

$$\frac{\theta P^j_1 \;\cdots\; \theta P^j_{k_j}}{\theta C^j}$$

for each substitution $\theta$ for $x_1 \cdots x_n$ (this is a simple consequence of
Gentzen's Cut-Elimination Theorem).

(iii) Not only the complexity of the terms $t_{ij}$ is considered but also how
the length of a derivation is related to the choice of the corresponding $t_{ij}$. In
particular, the following questions are raised:

(a) Which substitutions $\theta$ are needed to derive $B_{m+1}$ from $B_1 \cdots B_m$?

(b) How is the length of a derivation of $B_{m+1}$ from $B_1 \cdots B_m$ related to
the complexity of the corresponding $\theta$'s?

(c) What is an efficient order of application of the rules? 

It should be noted that (a), (b) and (c) are by no means trivial even if 
validity of formulae in L is recursively decidable; proof-theoretic methods 
are used to study the efficiency of decision procedures. 


1.4. A principal aim of the theory of proofs is to make differences among
proofs, previously judged only by "aesthetic criteria of elegance or
convenience", objects of study. One such difference, which has received a
great deal of attention, involves "directness", where for our purposes a
direct proof is one which contains no term more complicated than all those
in the theorem proved. If direct proofs are formalizable in the above
system of rules for $L$ then this fact already provides an answer to question
(a). In particular, such direct proofs may be the values of a complete
proof-search procedure of a kind similar to the so-called semantic tableaus.

In 1961 Kreisel and Tait gave an equation calculus intended to formalize
the notion of a computation from equations which define the values of a
number-theoretic function. This calculus $K$ is related in the manner
sketched above to a set $J$ of axioms (for an elegant formulation see KREISEL
and KRIVINE [1967], Exercise 8, p. 48), and, using this relation, Kreisel and
Tait showed that $K$ is complete in the sense that, for equations $E_i$, if
$(E_1 \land \cdots \land E_m) \to E_{m+1}$ is valid then $E_{m+1}$ is derivable from $E_1 \cdots E_m$. We
shall try to answer questions (a), (b) and (c) for $K$; one thing we will prove
is that direct "computations" are formalizable in $K$.


1.5. Kreisel and Tait actually gave two proofs of completeness, both of
which are instructive. The first proof is the above-mentioned
model-theoretic one; consider the two-sorted first-order language corresponding
to the structures $(N, F, =, 0, s)$, where $N$ is the set of natural numbers, and
$F$ is a collection of number-theoretic functions containing the function
constantly 0, the successor function $s$, and closed under explicit definition
(lambda-abstraction). Let $J$ be the following collection of axioms:

(1) $x = x$,

(2) $(fx = gx \land x = y) \to fy = gy$,

(3) $(fx = gx \land y = x) \to fy = gy$,

(4) $sx = sy \to x = y$,

(5) $0 = sx \to y = z$,

($6_n$) $x = s^n x \to y = z$,
where $s^1 = s$ and $s^{n+1} = s s^n$. The members of $J$ are clearly valid over the
collection of structures $(N, F, =, 0, s)$. In addition, if one considers general
models of $J$, if $(E_1 \land \cdots \land E_m) \to E_{m+1}$ is not valid, then it has a
countermodel of the form $(N, F, =, 0, s)$. In particular, the function variables in the
$E_i$ can be interpreted as number-theoretic functions with finite support;
thus, we have decidability as well as completeness for $K$.


1.6. The second proof goes by way of formalizing in $K$ a particularly
elementary decision procedure for the validity of $(E_1 \land \cdots \land E_m) \to E_{m+1}$.
This formalization results in a complete proof-search procedure, and
analysis of the proof provides answers to questions (a) and (c), namely,

(a) if $(E_1 \land \cdots \land E_m) \to E_{m+1}$ is valid and $E_1 \land \cdots \land E_m$ is satisfiable, then
there is a direct computation of $E_{m+1}$ from $E_1 \cdots E_m$ of length $\le 2^{kp}$, where
$p$ is the number of occurrences of non-logical symbols in
$(E_1 \land \cdots \land E_m) \to E_{m+1}$ (if $E_1 \land \cdots \land E_m$ is not satisfiable, then a similar
result is true, but not necessarily for $E_{m+1}$), and

(c) the validity of $(E_1 \land \cdots \land E_m) \to E_{m+1}$ can be decided by a
deterministic Turing machine in polynomial time (a polynomial number of
operations in the length of a code of the formula).

Here it should be noted that the bound $2^{kp}$ is for computations as trees of
equation occurrences; by identifying different occurrences of the same
equation, these can be coded as sequences of equations with a resulting
bound of $kp^2$.

We present the second proof of Kreisel and Tait.


1.7. In order to give an answer to question (b) we will analyze
computations in $K$ in a manner similar to Gentzen's cut-elimination argument. In
particular, we will show that if the individual inferences of a given
computation are "fully analyzed", then these inferences can be permuted
so as to obtain a direct computation. More precisely, we will prove that

(b) if in a computation $T$ of $E_{m+1}$ from $E_1 \cdots E_m$

(i) the rules corresponding to the axioms (5) and ($6_n$) are not used at all,
and

(ii) each application of the rule of substituting equals for equals
(corresponding to axioms (2) and (3)) substitutes for at most one occurrence of
one of the equals,
then there is a direct computation of $E_{m+1}$ from $E_1 \cdots E_m$ of length $\le kp^2$,
where $p$ is the length of $T$.


2. Preliminaries 


2.1. We consider equations E between individual terms a, b,c,..., possi- 
bly containing function variables, and finite sets of equations S. 


2.2. Notation. We write $[\cdots](a_1/x_1, \ldots, a_n/x_n)$ for the operation of
simultaneously substituting $a_i$ for $x_i$ in $[\cdots]$; substitutions $\theta$ may
substitute function terms $\lambda x_1 \cdots x_n\, a$ for $n$-place function variables $f$ under the
definitions $\theta(f a_1 \cdots a_n) = (\theta f)(\theta a_1) \cdots (\theta a_n)$ and
$(\lambda x_1 \cdots x_n\, a) b_1 \cdots b_n = a(b_1/x_1, \ldots, b_n/x_n)$.
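
The two defining clauses can be sketched in Python. The term encoding (strings for individual variables and constants, tuples `(head, arg1, ..., argn)` for applications) and the name `substitute` are assumptions made for this example; a function term $\lambda x_1 \cdots x_n\, a$ is passed as a pair `(params, body)`.

```python
def substitute(term, ind=None, fun=None):
    """Apply a substitution theta to a term.

    ind maps individual variables to terms; fun maps function variables to
    pairs (params, body) standing for  lambda params. body.
    """
    ind = ind or {}
    fun = fun or {}
    if isinstance(term, str):                  # individual variable or constant
        return ind.get(term, term)
    head, *args = term
    new_args = [substitute(a, ind, fun) for a in args]   # theta a_1, ..., theta a_n
    if head in fun:
        params, body = fun[head]
        if len(params) != len(new_args):
            raise ValueError("arity mismatch for function variable " + head)
        # (lambda x_1...x_n a) b_1...b_n = a(b_1/x_1, ..., b_n/x_n)
        return substitute(body, dict(zip(params, new_args)))
    return (head, *new_args)

theta_ind = {"x": "0"}
theta_fun = {"f": (("u", "v"), ("g", "v", "u"))}
result = substitute(("f", "x", ("s", "y")), theta_ind, theta_fun)
```

Here `result` is the term g(sy, 0): the arguments are substituted first, and the function term for f is then applied to them.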


2.3. DEFINITION. A function $M$ from terms to non-negative integers is
called a measure if $M(a) \le M(b) \Rightarrow M(c(a/x)) \le M(c(b/x))$ and,
whenever $x$ occurs in $c$, $M(a) \le M(c(a/x))$.


If $M$ is a measure we set $M(a = b) =_{df} M(a) + M(b)$, and
$M(S) =_{df} \sum_{E \in S} M(E)$.

Length is a measure, where $\mathrm{lh}(a) =_{df}$ the number of occurrences of
symbols in $a$.
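
As a small check of the two measure conditions for length, the following Python sketch computes lh on terms and tests the conditions on one instance. The term encoding (strings for variables and constants, tuples for applications) and the helper names `lh` and `replace_var` are assumptions made for this example; checking one instance is of course not a proof.

```python
def lh(term) -> int:
    """Number of occurrences of symbols in a term."""
    if isinstance(term, str):
        return 1
    head, *args = term
    return 1 + sum(lh(a) for a in args)

def replace_var(term, x, a):
    """The term c(a/x): replace every occurrence of the variable x by a."""
    if isinstance(term, str):
        return a if term == x else term
    head, *args = term
    return (head, *(replace_var(t, x, a) for t in args))

a = "0"                 # lh(a) = 1
b = ("s", "0")          # lh(b) = 2
c = ("f", "x", "y")     # x occurs in c

c_a = replace_var(c, "x", a)
c_b = replace_var(c, "x", b)
monotone = lh(a) <= lh(b) and lh(c_a) <= lh(c_b)
subterm_bound = lh(a) <= lh(c_a)
```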


2.4. DEFINITION. A substitution $\theta$ is called non-projecting if whenever
$\theta f = \lambda x_1 \cdots x_n\, a$ each $x_i$ actually occurs in $a$.


Note that if $M$ is a measure and $\theta$ is non-projecting, then the function
$M^*$ defined by $M^*(a) = M(\theta a)$ is a measure.


2.5. DEFINITION. $S$ is called simple if each equation in $S$ has one of the
forms $x = 0$, $x_1 = x_2$, $x_1 = s x_2$, or $x_{m+1} = f x_1 \cdots x_m$.


2.6. DEFINITION. The calculus $K$ of Kreisel and Tait consists of the axioms
$a = a$ and the rule of substituting equals for equals

(1) $\dfrac{E(a/x) \quad a \doteq b}{E(b/x)}$,

where $a \doteq b$ is, ambiguously, $a = b$ and $b = a$, together with the rules

(2) $\dfrac{sa = sb}{a = b}$,

(3) $\dfrac{0 = sa}{b = c}$, ($4_n$) $\dfrac{a = s^n a}{b = c}$.


The calculus consisting of the rule (1) only will be called $H$ (and
corresponds to axioms (1), (2) and (3) of $J$).


2.7, DEFINITION. Computations T in K or H are binary trees of equation 
occurrences built up from assumptions and axioms according to the rules; 
it should be clear what it means for T to be a K or H computation of E 
from S. 


2.8. DEFINITION. The length of a computation T is the number of occur- 
rences of equations in T. 


2.9. DEFINITION. $T$ is called singular if each of its inferences of the form (1)
has $x$ occurring exactly once in $E$.


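2.10. DEFINITION. Given a computation $T$, a switch is a replacement of a
subcomputation

$$\dfrac{\dfrac{\begin{matrix}T_1\\ E(a/x, b/y)\end{matrix} \quad \begin{matrix}T_2\\ a \doteq c\end{matrix}}{E(c/x, b/y)} \quad \begin{matrix}T_3\\ b \doteq d\end{matrix}}{E(c/x, d/y)}
\quad\text{by}\quad
\dfrac{\dfrac{\begin{matrix}T_1\\ E(a/x, b/y)\end{matrix} \quad \begin{matrix}T_3\\ b \doteq d\end{matrix}}{E(a/x, d/y)} \quad \begin{matrix}T_2\\ a \doteq c\end{matrix}}{E(c/x, d/y)}$$

and a shift is a replacement of a subcomputation

$$\dfrac{\dfrac{\begin{matrix}T_1\\ E(a/x)\end{matrix} \quad \begin{matrix}T_2\\ a \doteq c(b/y)\end{matrix}}{E(c(b/y)/x)} \quad \begin{matrix}T_3\\ b \doteq d\end{matrix}}{E(c(d/y)/x)}
\quad\text{by}\quad
\dfrac{\begin{matrix}T_1\\ E(a/x)\end{matrix} \quad \dfrac{\begin{matrix}T_2\\ a \doteq c(b/y)\end{matrix} \quad \begin{matrix}T_3\\ b \doteq d\end{matrix}}{a \doteq c(d/y)}}{E(c(d/y)/x)}$$

or

$$\dfrac{\dfrac{\begin{matrix}T_1\\ E(b(a/x)/y)\end{matrix} \quad \begin{matrix}T_2\\ a \doteq c\end{matrix}}{E(b(c/x)/y)} \quad \begin{matrix}T_3\\ b(c/x) \doteq d\end{matrix}}{E(d/y)}
\quad\text{by}\quad
\dfrac{\begin{matrix}T_1\\ E(b(a/x)/y)\end{matrix} \quad \dfrac{\begin{matrix}T_3\\ b(c/x) \doteq d\end{matrix} \quad \begin{matrix}T_2\\ c \doteq a\end{matrix}}{b(a/x) \doteq d}}{E(d/y)}$$

A move is a switch or a shift, and we say that $T_1$ reduces to $T_2$ if $T_2$ results
from $T_1$ by a sequence of moves.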


2.11. Note that if $T_1$ is singular, and $T_1$ reduces to $T_2$ or $T_2$ reduces to $T_1$,
then
(i) $T_2$ is singular, and in addition,

(ii) each combination of two inferences in $T_1$ nested to the left is
covered by a clause in the definition of a move, and

(iii) each combination of two inferences in $T_1$ nested to the right arises
from some clause in the definition of shift.

Further observe that if $T$ is singular then:

(a) there is a $T_R$ such that $T$ reduces to $T_R$ and each sequence of moves
beginning with $T_R$ is a sequence of switches, and

(b) there is a $T_L$ such that $T_L$ reduces to $T$ and each sequence of moves
ending in $T_L$ is a sequence of switches.

This can be seen by measuring the "leftness" of $T$ by the $n$-tuple
$(k_1, \ldots, k_n)$, where $k_i$ is the length of the $i$-th maximal path, from left to
right, in $T$, and ordering these lexicographically. A sequence of switches
followed by a shift, all involving inferences "on" the leftmost path of a
subcomputation of $T$, always decreases leftness. $T_R$ can be obtained by
iterating this procedure as often as possible. $T_L$ can be obtained by
reversing it, less switches, as often as possible.
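
The leftness measure can be sketched in Python. The tree encoding `(label, [subtrees])` and the convention that the length of a path is its number of nodes are assumptions made for this example; the two sample trees have the shapes that occur before and after a shift, with placeholder labels.

```python
def leftness(tree):
    """Tuple of the lengths of the maximal paths of a tree, from left to right.

    A tree is a pair (label, [subtrees]); a leaf has an empty list.
    """
    label, children = tree
    if not children:
        return (1,)
    result = ()
    for child in children:
        result += tuple(1 + k for k in leftness(child))
    return result

# two inferences nested to the left (shape before a shift)
left_nested = ("E3", [("E2", [("E1", []), ("a=b", [])]), ("b=d", [])])
# the same three leaves with the inferences nested to the right
right_nested = ("E3", [("E1", []), ("a=d", [("a=b", []), ("b=d", [])])])

before = leftness(left_nested)
after = leftness(right_nested)
```

Python compares tuples lexicographically, so `after < before` expresses that the right-nested shape has smaller leftness.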


2.12. If $T$ is a singular $H$ computation then each subcomputation of $T_R$
has the form

$$\dfrac{\dfrac{\dfrac{\begin{matrix}T_1\\ E(a_1/x_1, \ldots, a_n/x_n)\end{matrix} \quad a_1 \doteq b_1}{E(b_1/x_1, \ldots, a_n/x_n)}}{\vdots} \quad a_n \doteq b_n}{E(b_1/x_1, \ldots, b_n/x_n)}$$

and $T_L$ has the form

[display missing in the source]

If $T$ is a singular $K$ computation, then this applies equally well to the
maximal fragments of $T_R$ and $T_L$ which are $H$ computations; these
maximal fragments are called $H$ fragments.


2.13. DEFINITION. If $M$ is a measure, we say that a computation $T$ of $E$
from $S$ is $M$-direct if for each term $b$ occurring in $T$ there is a term $c$
occurring in $E$ or $S$ with $M(b) \le M(c)$.


2.14. CONVENTION. Individual variables are interpreted as ranging over the
set of natural numbers, function variables as ranging over
number-theoretic functions, and 0 and $s$ as denoting zero and the successor
function, respectively.


2.15. DEFINITION. $\bigwedge S \to E$ is said to be analytic if it is valid and $\bigwedge S$ is
satisfiable.


Of course, if $\bigwedge S$ is not satisfiable there is no reason to expect lh-direct
computations from $S$ to exist for all $E$, but rather only for some instance of
the premiss of (3) or ($4_n$). Take, for example,

$$S_n =_{df} \{x_1 = s x_n,\ x_2 = s x_1,\ \ldots,\ x_n = s x_{n-1}\} \quad\text{and}\quad E =_{df} 0 = s0.$$

With this in mind, we will restrict our attention to $K$ less the rules (3) and
($4_n$), continuing to call this $K$. $K$ is complete in the sense that if $\bigwedge S \to E$ is
analytic, then $E$ is derivable from $S$.


3. Direct computations 


3.1. PROPOSITION. Suppose that $S$ is simple and $M$ is a measure; then

(i) if $\bigwedge S \to x = y$ is analytic, then there is an $M$-direct computation of
$x = y$ from $S$ of length $\le 2^{3k}$, where $k$ is the number of variables occurring in
$S$, and

(ii) if $\bigwedge S$ is not satisfiable, then there is an $M$-direct computation of
$x = s^p x$ ($0 = sx$) from $S$ of length $\le 2^{3k}$, where $p \le \mathrm{card}(S)$, $k$ is the number
of variables occurring in $S$, and $x$ occurs in $S$.


PROOF. We proceed inductively, constructing at stage $n$ a collection $C_n$ of
computations of members of $S_n$ from $S$, and a computation $T_n$. The
members of $C_n$ are used as auxiliaries in constructing $T_{n+1}$, which has the
form

$$\dfrac{\dfrac{\dfrac{x_{n+1} = y_{n+1} \quad T^*}{x_n = y_n} \quad T^*}{\vdots} \quad T^*}{x_1 = y_1}$$

where each $T^*$ is a computation from $S$. The construction must stop at
some $n_0 \le$ the number of variables occurring in $S$.

Stage 1: Set $S_1 =_{df} S \cup \{x = x : x \text{ occurs in } S\}$, $C_1 =_{df} S_1$, $x_1 =_{df} x$,
$y_1 =_{df} y$, and $T_1 =_{df} x = y$.

Stage $n+1$: Case 1. There are $a = c$ and $b = c$ in $S_n$ such that $a$ is not
$b$ and $M(a) \le M(b)$. Select computations $T_a$ and $T_b$ of $a = c$ and $b = c$,
resp., from $C_n$; set $S_{n+1} =_{df} S_n(a/b)$, $x_{n+1} =_{df} x_n(a/b)$, $y_{n+1} =_{df} y_n(a/b)$,

$$T^* =_{df} \dfrac{T_a \quad T_b}{a = b},$$

$$C_{n+1} =_{df} \left\{ \dfrac{T \quad T^*}{E(a/b)} : T \text{ is a computation of } E \text{ from } S \text{ in } C_n \right\}$$

and

$$T_{n+1} =_{df} \dfrac{x_{n+1} = y_{n+1} \quad T^*}{T_n}.$$

Case 2. Not case 1 but there are $c = sa$ and $c = sb$ in $S_n$ such that $a$ is
not $b$ and $M(a) \le M(b)$. Select computations $T_a$ and $T_b$ of $c = sa$ and
$c = sb$, resp., from $C_n$; set $S_{n+1} =_{df} S_n(a/b)$, $x_{n+1} =_{df} x_n(a/b)$, $y_{n+1} =_{df} y_n(a/b)$,

$$T^* =_{df} \dfrac{\dfrac{T_a \quad T_b}{sb = sa}}{b = a},$$

$$C_{n+1} =_{df} \left\{ \dfrac{T \quad T^*}{E(a/b)} : T \text{ is a computation of } E \text{ from } S \text{ in } C_n \right\}$$

and

$$T_{n+1} =_{df} \dfrac{x_{n+1} = y_{n+1} \quad T^*}{T_n}.$$

First suppose that $\bigwedge S \to x = y$ is analytic; then $S_{n_0}$ has the following
properties:

(i) If $f z_1 \cdots z_m$ occurs in $S_{n_0}$, then there is exactly one $z$ such that
$z = f z_1 \cdots z_m$ belongs to $S_{n_0}$.

(ii) If $s z_1$ occurs in $S_{n_0}$, then there is exactly one $z$ such that $z = s z_1$
belongs to $S_{n_0}$.

(iii) If 0 occurs in $S_{n_0}$, then there is exactly one $z$ such that $z = 0$ belongs
to $S_{n_0}$.

(iv) If $z_1 = z_2$ belongs to $S_{n_0}$, then $z_1$ is $z_2$.

(v) If $z = s z_1$ and $z = s z_2$ belong to $S_{n_0}$, then $z_1$ is $z_2$.

(vi) There are no $z_1$ and $z_2$ such that $z_1 = s z_2$ and $z_1 = 0$ belong to $S_{n_0}$.

(vii) There are no $z_1 \cdots z_m$ such that $z_1 = s z_m$, $z_2 = s z_1$, ..., $z_m = s z_{m-1}$
belong to $S_{n_0}$.
Thus an interpretation satisfying $S_{n_0}$ can be read off from $S_{n_0}$ and expanded
to an interpretation satisfying $S$; in particular, this interpretation does not
satisfy $x_{n_0} = y_{n_0}$ unless $x_{n_0}$ is $y_{n_0}$. Now it can easily be seen from the structure
of $T_{n_0}$ that $\bigwedge S \to (x_1 = y_1 \leftrightarrow x_{n_0} = y_{n_0})$, hence $x_{n_0}$ is $y_{n_0}$. So we may take

$$\dfrac{\dfrac{\dfrac{x_{n_0} = y_{n_0} \quad T^*_{n_0-1}}{x_{n_0-1} = y_{n_0-1}}}{\vdots} \quad \begin{matrix} \\ x_2 = y_2 \quad T^*_1 \end{matrix}}{x_1 = y_1}$$

for the desired $M$-direct computation of $x = y$ from $S$; its length is $\le 2^{3k}$,
where $k$ is the number of variables occurring in $S$, since the maximum
length of a path in $T^*_i$ is $\le 3i$.

Now suppose that $\bigwedge S$ is not satisfiable; then, since $S_{n_0}$ satisfies (i) to (v),
either (vi) (with $z_1$ different from $z_2$) or (vii) fails. In the former case the
desired computation is obvious. In the latter case, choose $z_i$ so that $M(z_i)$
is maximum; then for each $q \le m$ we have $M(s^q z_i) \le M(s^m z_i)$, and a
suitable computation of $z_i = s^m z_i$ from $S$ can easily be constructed from
those in $C_{n_0}$. □
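
The two cases of the construction amount to repeatedly identifying variables: Case 1 merges two variables with the same right-hand side, and Case 2 merges a and b when c = sa and c = sb are both present. The following Python sketch decides, for a satisfiable simple S, whether x = y follows in this way, using a union-find structure. It is an illustration of the decision idea only, not the authors' construction of computations: it builds no computation, ignores the measure M, and does not detect unsatisfiability. The tuple encoding of simple equations and the name `entails_equal` are assumptions made for this example.

```python
def entails_equal(S, x, y):
    """Decide x = y from a simple set S by merging variables to a fixpoint.

    Assumed encoding of simple equations:
      ("zero", x)          for  x = 0
      ("var", x1, x2)      for  x1 = x2
      ("succ", x1, x2)     for  x1 = s x2
      ("fun", x, f, args)  for  x = f x1 ... xn  (args a tuple of variables)
    """
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return False
        parent[ra] = rb
        return True

    changed = True
    while changed:
        changed = False
        by_rhs = {}     # canonical right-hand side -> a variable equal to it
        succ_of = {}    # class of c -> some a with c = s a
        for eq in S:
            kind = eq[0]
            if kind == "var":
                changed |= union(eq[1], eq[2])
                continue
            if kind == "zero":
                lhs, key = eq[1], ("zero",)
            elif kind == "succ":
                lhs, key = eq[1], ("succ", find(eq[2]))
            else:
                lhs, key = eq[1], ("fun", eq[2]) + tuple(find(a) for a in eq[3])
            # Case 1: two variables with the same right-hand side are merged
            changed |= union(lhs, by_rhs.setdefault(key, lhs))
            # Case 2: c = s a and c = s b give a = b
            if kind == "succ":
                changed |= union(eq[2], succ_of.setdefault(find(lhs), eq[2]))
    return find(x) == find(y)

S = [("succ", "c", "a"), ("succ", "d", "b"), ("var", "c", "d"),
     ("fun", "u", "f", ("a",)), ("fun", "v", "f", ("b",))]
```

With this S, c = d yields a = b by Case 2, and then u = v by Case 1, while u = a is not obtained.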


3.2. PROPOSITION. If $M$ is a measure, then

(i) if $\bigwedge S \to a = b$ is analytic there is an $M$-direct computation of $a = b$
from $S$ of length $\le 2^{3\,\mathrm{lh}(\bigwedge S \to a = b)}$, and

(ii) if $\bigwedge S$ is not satisfiable there is an $M$-direct computation of $a = s^p a$
($0 = sa$) from $S$ of length $\le 2^{3\,\mathrm{lh}(S)}$, where $p \le \mathrm{lh}(S)$ and $a$ occurs in $S$.


PROOF. (ii) is just like (i) using the corresponding case of Proposition 3.1,
so we only do (i). To each occurrence $t$ of a term in $\bigwedge S \to a = b$ assign a
new variable $x_t$; let

$$S_1 =_{df} \{x_t = 0,\ x_t = y,\ x_{st_1} = s x_{t_1},\ x_{ft_1 \cdots t_m} = f x_{t_1} \cdots x_{t_m} : t \text{ an occurrence of } 0,\ t \text{ an occurrence of } y,\ st_1 \text{ and } ft_1 \cdots t_m \text{ occurrences}\}$$

and set

$$S_2 =_{df} \{x_{t_1} = x_{t_2} : t_1 = t_2 \text{ belongs to } S\}.$$

Now apply Proposition 3.1 to $\bigwedge(S_1 \cup S_2) \to x_a = x_b$ for the measure $M^*$
defined by $M^*(c) = M(c(\ldots, t/x_t, \ldots))$. □


3.3.1. DEFINITION. We say that a computation T of E from S has the weak 
subterm property if each inference in T of the form (2) has sa and sb 
occurring in S. T has the subterm property if it has the weak subterm 
property and each of its axioms a = a has a occurring in E or S. Singular 
computations with the subterm property are related to M-direct ones by 
reducibility. 


3.3.2. PROPOSITION. If $M$ is a measure and $T$ is a singular computation with
the subterm property, then $T$ reduces to an $M$-direct $T_M$.


PROOF. By applying a sequence of switches to $T_R$ construct a $T_M$ so that it
contains no subcomputation of the form

$$\dfrac{\dfrac{\begin{matrix}T_1\\ E(a/x, c/y)\end{matrix} \quad \begin{matrix}T_2\\ a \doteq b\end{matrix}}{E(b/x, c/y)} \quad \begin{matrix}T_3\\ c \doteq d\end{matrix}}{E(b/x, d/y)}$$

with $M(a) < M(b)$ and $M(d) \le M(c)$. Since $T_M$ is singular with the
subterm property, it follows easily by induction over $T_M$ that it is
$M$-direct. □


3.4. PROPOSITION. If $T$ is a singular computation of $E$ from $S$ with the weak
subterm property, then there is a singular computation $T_S$ of $E$ from $S$ with the
subterm property of length $\le \mathrm{lh}(T)$.


PROOF. Construct from $T_L$, by a sequence of switches, a computation
containing no $H$ fragment with an inference of the form


Qa: = by (€1/x1) c= d, 
a,= b, (d,/x,) 


preceding one of the form


a2 (d2/Xx2) = b, 


Each H fragment of this computation has the form 
ap _ b, E,
ap-1 = b, 
a,>= b, E,-1 
a,= b, E, 
a,= bg-1 
a, = b, E,+q-2 
a,=b, 


If a, and b, occur in S$ (or in E) then nothing need be done to this 
fragment; otherwise, a, is b, and this fragment should be replaced by 


a,=ai E,-1 
a, = a2 
a, = Apy-1 E, 
a, =a, E, 
Qa, = Dg-1 


Finally, obtain T; by deleting inferences of the form 


a=c b=b 


a=c 


from the resulting computation. □


3.5. PROPOSITION. If $T$ is a singular computation of $E$ from $S$, then there is a
singular computation $T_W$ of $E$ from $S$ with the weak subterm property of
length $\le (\mathrm{lh}(T))^2 + \mathrm{lh}(T) + 1$.


PROOF. Let $\mathrm{lh}(T) = n$. We proceed inductively over larger and larger
subcomputations of $T$ which end in inferences of the form (2), replacing
these by new computations (however, we will continue to refer to the
computation at stage $k$ as "$T$"). At each stage we consider the $H$ fragment
containing the premiss of one such inference; if the corresponding
subcomputation of the original $T$ has length $p$ then this fragment has at most
length $2p$. The fragment is converted into one of at most length $p + 2$ and
then into a $K$ computation of at most length $2p + 2$, which is substituted for
it in the larger computation (while the inference of the form (2) is deleted).
At most $p + 2$ of the $2p + 2$ equations are "passed on" to the next $H$
fragment, and one equation has been deleted altogether; the bound
$n^2 + n + 1$ follows easily.

Stage 1: Replace each maximal subcomputation of the form 


sia =s‘q 
st'a=st'a 


Sa = Sa 


by a=a. 

Stage k +1: Select a minimal subcomputation of T ending in an 
inference of the form (2) which has not yet been considered, and consider 
the H fragment containing sa = sb. If this fragment is just sa = sb, then do 
nothing (since we are at stage k +1, the subcomputation ends in two 
successive inferences of the form (2) and sa and sb occur in S since ssa and 
ssb do). Otherwise, by the transformations of Proposition 3.4, obtain from 
this fragment an H computation of the form 


Sa = ty Eo 
Sa=t, 


Sa = tn E., 
SA = tn+t 


> 


where ti iS Sa, tn+1 is sb, t, is not t.,, and by induction hypothesis each term 
occurring in the E; occurs in S (if in the second step of the transformation 
of Proposition 3.4 a, and b, occur in S or E, then the inference 


a, = a Ap = 04 
a,=b, 


should be inserted at the appropriate place). We distinguish two cases: 
(i) each ¢; has the form sr, and 
(ii) otherwise. 
We treat case (ii) first, reducing it to case (i) for a smaller computation. 
Case (ii). Select those pairs i, j such that 
(a) G-1 iS Sr-1, 
(b) ¢ is not sr, and 
(c) j is the smallest number larger than i such that f is again s7;,; 
then for each such pair E,_, is t-. = t, Ej-1 is §-1 = §, and 4, and 4 occur in 
S. Replace each segment 
Sa = t,-1 Ej=6
Sa =f; 
Sa = h-1 haixt 
sa=t 
by 
ti=F E; 
G1 = bis 
&-1 = &-1 E,-1 
Sa = t-1 ti=t 
sa=f 


and consider the fragment obtained by replacing each such replacement by 
its last inference 


Sat;-; 51= i 
sa = 5 


under case (i). 
Case (i). Replace sa = t_i by a = r_i throughout the given H computation,
simultaneously replacing each inference of the form


        sa = t_i      t_i = t_{i+1}
        sa = t_{i+1}
by
                      sr_i = sr_{i+1}
        a = r_i       r_i = r_{i+1}
        a = r_{i+1}


4. Conclusion 


4.1. The fact that direct computations are formalizable in K has many 
consequences. Just to mention one, the rules of K are the most efficient 
possible (up to a constant factor when computations are coded as se- 
quences of equations) of any finite set of ‘‘analytic” rules, both with respect 
to the length of computations and the length of terms. We leave most of the 
details to the reader; the principal fact needed is the following 


4.1.1. Proposition. If M is a measure and E is derivable from S, then there
are substitution instances E_1, S_1; ...; E_n, S_n of E and S, and singular
computations T_i of E_i from S_i satisfying: for each substitution θ there is a
non-projecting θ_i and a computation T_θ such that
(i) θ_i E_i = θE,
(ii) θ_i S_i = θS,
(iii) T_i reduces to T_θ, and
(iv) θ_i T_θ is M-direct.


PROOF. Given E and S, by anticipating all possible “projections”, one can
find substitution instances E_1, S_1; ...; E_n, S_n such that
(a) if E is derivable from S then for each i so is E_i from S_i, and
(b) for each θ there is an i and a non-projecting θ_i with (i) and (ii).
Suppose that E is derivable from S and choose T_i, a singular computation
of E_i from S_i with the subterm property. Given θ choose i and θ_i as
above; let M*(a) =_df M(θ_i a), and apply Proposition 3.3 to T_i with respect
to M* in order to obtain T_θ. □


4.2. Our analysis of singular computations does not, as it stands, settle the 
relationship of arbitrary computations to direct ones. In particular, it does 
not answer the following 


Question. Is there an infinite sequence of pairs E_i, S_i and a number n such
that there is a computation of E_i from S_i of length ≤ n, but any direct
computation of E_i from S_i has length at least i?


However, using our analysis, one could give a negative answer to this 
question if one could prove the following 


Statement. There is a function f such that if T is a computation of E from
S then there are E*, S*, θ*, and T* satisfying
(i) θ* is non-projecting,
(ii) θ*E* = E,
(iii) θ*S* = S,
(iv) T* is a singular computation of E* from S*, and
(v) lh(T*) ≤ f(lh(T)).


4.3. Finally, other classes of computations formalizable in K have been 
studied in connection with efficiency. For example, in STATMAN [1974], 
Chapter 2, the length of computations from axioms satisfiable by partial 
functions is compared with the length of those from ‘‘equivalent’”’ axioms 
only satisfiable by total ones. 
1. Introduction 


1.1. Aims and interests 


This is a survey of work on formal theories related to substantial portions 
of mathematical practice. Most of current mathematics is based on 
non-constructive set-theoretical principles, but in fact strikingly little of 
what is implicit in those principles is actually used (except, of course, in set 
theory itself). For example, the bulk of mathematical analysis may be 
developed within the finite type structure over the natural numbers N and 
indeed within type level three. (The type level of N is 0, of N^N and of the
real numbers is 1, of the real functions is 2, and of operators on such is 3.)
Transfinite types appear in set theory by transfinite iteration of the power 
set operation. But where such iteration is used at all in analysis, it is applied 
only to operations within a given type. Practice may be regarded as 
deficient in that it does not pursue the potential resources of transfinite 
types; this view is borne out by recent results concerning determinateness 
of Borel games (cf. MARTIN [1975]). Nevertheless, there is interest in a
logical analysis of practice, for reasons which will be given in a moment; 
the restriction to low types certainly makes such a project more feasible. 

Viewed logically, the main existential principles within any given type S 
are comprehension axioms or choice axioms. The former assert that for 
each property φ of elements of S there exists the set of all objects in S
having the property φ. The class of properties considered may be described
precisely within a formal language and, again quite strikingly, the defining 
properties which are actually used are of very low logical complexity (in 
several senses). This makes an informative logical analysis of practice even 
more feasible. 

On the face of it such a study should have the same kind of interest as 
familiar mathematical investigations of what kind of problems can be solved 
by given limited means such as ruler-and-compass constructions, and 
solutions by mechanical computation. There is a difference from the 
present study in that the means specified in these familiar cases were 
already understood informally in a quite clear-cut way. There is a similarity 
in that the restriction to given means of construction or solution in a certain 
area may be gratuitous or an historical accident; the study of what can be 
obtained by such means is thus also an occasion to consider whether the 
restrictions are of a deeper or intrinsic significance. 

This last is the position taken in different viewpoints as to the founda- 
tions of mathematics which call for restrictions on the methods employed 
or completely alternative developments. We have in mind particularly the 
ideas originating with Hilbert, Brouwer, Poincaré, and the Borel-Lebesgue 
school. 

As is well known, Hilbert wished only to restrict (to a bare minimum) the 
methods which would be used to justify mathematics by consistency proofs 
of appropriate formal systems. Though this program has turned out to be 
untenable, the theory of formal systems and proofs to which it gave rise has 
turned out to be one of the principal tools of the work described below (cf. 
also Chapter D.2). 

It is not necessary to go into the ideas for constructivity introduced by 
Brouwer since that is covered fully in Chapter D.5. Certain of the formal 
theories studied there are very similar to those considered here. A 
principal difference is that we follow mathematical practice here in 
assuming classical logic. (Other significant differences will be noted below.) 
It happens that one means to approach our problems is to reduce the 
classical theories to (formally) intuitionistic theories and then exploit the 
explicit interpretability of the latter. 

The viewpoints of the French mathematicians mentioned above were 
never developed systematically by them. In this respect the logical interest 
of the present work lies in giving precise explanations of the ideas involved 
and in comparing them with practice as first tests of their viability. 

Poincaré laid emphasis on the primacy of our conception of the natural 
numbers and the indefiniteness of the concepts of function and set (and 
particularly of N^N and R). This led him to reject the use of so-called
impredicative definitions, such as defining a subset of N (or a real number)
by a property φ which involves quantification over a presumed totality of
all subsets of N (or over all of R). The logical analysis of predicativity has
made considerable progress, and will play a role in various connections
below.¹

The Borel-Lebesgue school was also dominated by the idea of dealing 
only with explicitly definable sets and functions. The difference lay in 
permitting definitions built up by transfinite recursion as well as (possibly) 
accepting R as fixed. This viewpoint has not yet been explained in a precise 
way, but much light has been thrown on it by developments in recursion 
theory as well as in terms of formal theories extending those described 
below. 

Finally, logicians may find the work here to be of interest simply because 


¹ The terminology “predicative” is not used in the same way by all authors. We apply it to
that part of mathematics which is ultimately reducible to our conception of the natural
numbers (cf. 6.4.5 and 9.3 below).
it is a meeting ground of everyday mathematics with ideas and methods 
from proof theory, model theory, and recursion theory leading to a 
number of results which are satisfying in their own right. 


1.2. Plan of the chapter 


Much of the literature in this subject deals with systems of second order, 
including a number which were selected for study only on syntactic 
grounds. The present exposition differs by concentrating on finite-type 
theories which directly reflect logical features of practice and in which 
everyday mathematics is readily developed. One of the main logical 
problems then is to give information about the second-order content of 
these theories.²

We begin in Section 2 with informal consideration of some closure 
conditions on universes of sets and functions. The elementary closure 
conditions lead to finite-type structures. Further closure conditions include 
quantification functionals for each domain which is regarded as fixed (or 
definite) and recursion functionals for N. In Section 3 there is a (necessarily) 
brief indication of the relationship of these closure conditions to various 
parts of mathematics. 

In Section 4 we proceed to formulate a number of finite-type theories 
based on these closure conditions. They are also recast in the logically 
more familiar form of CA (comprehension axiom) schemes and AC (axiom
of choice) schemes. The second-order parts of these theories include
systems which have been studied in the literature such as Δ^1_1-CA,
Σ^1_1-AC, Π^1_1-CA, Δ^1_2-CA; these are described in Section 5. In continuation,
Section 6 presents principles of transfinite induction and recursion
expressed in second-order form. This permits us to explain iteration of 
certain principles, such as for ramified analysis. 

Several recursion-theoretic models for finite-type theories are presented 
in Section 7. They are used to establish some independence results for 
mathematical statements relative to those parts of practice developed
within the given theories. In particular, a number of examples are given 
using finite-type models whose second-order section consists exactly of the 
hyperarithmetic functions and sets. 

More precise information about the second-order content of the theories 
studied is obtained by various proof-theoretical methods in Section 8. 


² This approach has been developed by me in various lectures and courses over the years
since 1970 (beginning with FEFERMAN [1971]). As will be seen below, it is a synthesis of ideas
and results due to many people. The plan here is derived from that of FEFERMAN [1978].
These include Gentzen’s method of cut-elimination and its adaptation to
normalization of terms, reduction to formally intuitionistic theories, and
Gödel’s method of functional interpretation. Since these methods are
discussed at greater length in Chapters D.2 and D.5, only special features 
of their employment for the problems here are discussed. Many of the 
results take the form that a given finite-type theory T° is a conservative 
extension of a second-order fragment T’. Among the side results described 
are characterizations of the provable well-orderings of various theories. 

Section 9 concludes with a brief indication of the literature on further 
topics. It is not within the scope of this chapter to develop finite-type 
theories of ordinals needed to reflect the Borel-Lebesgue approach. For 
this reason and as a principal illustration we have concentrated throughout 
on the predicative/impredicative distinction as it manifests itself both in 
mathematics and in formal systems. 

Sections 2 and 3 may be read without any background in logic and 
Section 4 with only moderate background. Sections 5-8 require more
experience but not much special knowledge except for parts of the theory
of recursive and hyperarithmetic functions (for which cf. ROGERS [1967]).


1.3. Historical note 


Theories of finite type go back to the beginnings of modern foundational 
work with Russell’s introduction (in 1908) of the (predicative) ramified 
theory of types over an unspecified set of individuals. This was elaborated in 
the Principia Mathematica where the so-called ‘‘axiom of reducibility” was 
added simply to get around difficulties of developing mathematics in the 
ramified system. As later realized by Ramsey, this made it equivalent to the 
(impredicative) simple or unramified theory of types. 

Hilbert formulated several theories of functions of finite type over N 
during the 1920’s. A third-order functional theory over N is presented in 
HILBERT and BERNAYS [1939] (Supplement IV). CHURCH [1940] gave a
theory of functionals of all finite types over N in the symbolism of the typed
λ-calculus. All of these systems accepted full impredicative comprehension
schemes as axioms. Church’s theory is equivalent in strength to Zermelo 
set theory. 

The consideration of constructively acceptable axioms for objects of 
finite type and specifically the recursion operators R is due to GÖDEL
[1958]. This, together with the treatment of KLEENE [1959c] of recursion in
quantification functionals, forms the principal theoretical antecedent of the
present work. 
2. Closure conditions on universes and finite-type structures 


2.1. Elementary closure conditions on universes of sets and functions 


2.1.1. By a universe is meant a pair U = (Set_U, Fn_U) of collections called,
respectively, the sets of U and the functions of U. Elements of the sets of U
may in certain cases themselves be sets or functions of U, but can also be
objects of other kinds such as numbers.


NOTATION. (i) A, B, C, X, Y, Z range over Set_U and f, g, h range over Fn_U.
We use a, b, c, x, y, z for elements of sets not otherwise distinguished.
Certain capital letters are used for specific functions.

(ii) f: A → B is written when Dom(f) = A and Rng(f) ⊆ B, and fa for
f(a) when a ∈ A. Gr(f) denotes the graph of f considered as a subset of
A × B. For X ⊆ A, f↾X is the restriction of f to X and C_A(X) is the
characteristic function of X relative to A, i.e., C_A(X): A → {0, 1},
(C_A(X))a = 0 ⇔ a ∈ X.

(iii) P_U(A) = {X ∈ Set_U | X ⊆ A}, (A → B)_U = {f ∈ Fn_U | f: A → B}.
When there is no ambiguity, the subscript U will be dropped, and
(A → B)_U is then denoted by (A → B) or B^A.


2.1.2. U is said to be Cartesian closed³ if it satisfies the following
conditions.

(1) {0, 1} ∈ Set_U,

(2) A, B ∈ Set_U ⇒ A × B, (A → B)_U and P_U(A) belong to Set_U,

(3) f ∈ Fn_U ⇔ Gr(f) ∈ Set_U,

(4) A ∈ Set_U and X ⊆ A ⇒ (X ∈ Set_U ⇔ C_A(X) ∈ Fn_U),

(5) for each combination A, B, C of sets of U the functions
K, S, P, P¹, P², and D defined as follows are all in Fn_U (where a ∈ A,
b ∈ B, f ∈ (A → (B → C)), g ∈ (A → B), and i ∈ {0, 1}):

(i) Kab = a (the constant functions),
(ii) Sfga = fa(ga) (substitution),
(iii) Pab = (a, b), P¹(a, b) = a, P²(a, b) = b (pairing and projections),
(iv) Dabi = a if i = 0, and Dabi = b if i = 1 (definition by cases).

All the universes U considered in the following are assumed to be
Cartesian closed.


³ The terminology comes from category theory; only a concrete form of the notion is used
here.
NOTATION AND REMARKS. (i) f a_1 ⋯ a_n = (⋯(f a_1)⋯)a_n, i.e., association is
to the left. We write (A_1, ..., A_n → B) for (A_1 → ⋯ → A_n → B), i.e., for
A_1 → (⋯ → (A_n → B)⋯). To specify domains we may use subscripts,
e.g., K_{A,B} ∈ (A, B → A), P_{A,B} ∈ (A, B → A × B), etc.

(ii) K, S are the basic combinators which serve to generate all functions
explicitly definable by repeated application. The λ-notation for abstraction
may be explained in terms of these (and conversely). Thus I = I_A =
λx ∈ A. x is defined by I = SKK, and C = λf ∈ (B → C) λg ∈ (A → B) λx
∈ A. f(gx) by C = S(KS)K, etc.; cf. Chapter D.7.

(iii) Conditions (3) and (4) express interchangeability or equivalence of 
sets and functions. This is accepted in set-theoretical mathematics but not 
in constructive mathematics where sets (or properties) are in general 
undecidable though all functions (constructions) are computable (cf.
Chapter D.5).

(iv) The subsets of any given set are closed under finite Boolean 
operations, as is seen by the equivalence of sets and functions and 
definition by cases. 

(v) Ordinarily closure under the operations A ↦ P_U(A) and
A, B ↦ (A → B)_U would be considered highly non-elementary, but that is
only the case in the presence of comprehension principles which permit
quantification over any set. Without such we may think of P_U(A) simply
as a grouping together of all objects of the same kind or type in U, namely
the subsets of A in U (and similarly for (A → B)_U).
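
The identities in remark (ii) can be checked mechanically. The following is a minimal sketch in which curried Python callables stand in for the functions of U (an assumption of the illustration only; the names double and succ are sample arguments, not part of the text). It verifies I = SKK and C = S(KS)K on particular values.

```python
# Curried combinators from 2.1.2 (5). Python callables stand in for
# functions of the universe; this is an illustrative assumption.
K = lambda a: lambda b: a                              # Kab = a
S = lambda f: lambda g: lambda a: f(a)(g(a))           # Sfga = fa(ga)
P = lambda a: lambda b: (a, b)                         # Pab = (a, b)
D = lambda a: lambda b: lambda i: a if i == 0 else b   # definition by cases

# Remark (ii): identity and composition obtained from K and S alone.
I = S(K)(K)        # I x = K x (K x) = x
C = S(K(S))(K)     # C f g x = f(g x)

# Sample arguments (hypothetical, chosen only for the check).
double = lambda n: 2 * n
succ = lambda n: n + 1

result_identity = I(7)
result_compose = C(double)(succ)(3)   # double(succ(3))
```

Since S and K take their arguments one at a time, association to the left in remark (i) corresponds to chained calls such as S(K)(K)(7).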


2.1.3. The finite types of U over A_0, ..., A_n are the sets generated from
A_0, ..., A_n by closing under the operations A, B ↦ A × B, (A → B)_U,
P_U(A).

Type symbols σ are used to denote members A_σ of U thus generated:
given symbols γ_0, ..., γ_n for the ground types, they are built up by formal
operations σ, τ ↦ σ × τ, (σ → τ), [σ]. Then A_{γ_i} = A_i, A_{σ×τ} = A_σ × A_τ,
A_{(σ→τ)} = (A_σ → A_τ), and A_{[σ]} = P(A_σ).

U is said to be of finite type over A_0, ..., A_n if each set X of U belongs to
some A_{[σ]}, i.e., is a subset of some A_σ.
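
The formation of type symbols can be made concrete with a small data structure. The sketch below is only an illustration: the class names and the function level are hypothetical, and the clauses defining the level of a compound type follow a common convention that is assumed here rather than taken from the text. With that convention the levels agree with those quoted in 1.1 (N has level 0, N^N level 1, real functions level 2, operators on these level 3).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Ground:          # ground type symbol γ_i
    name: str

@dataclass(frozen=True)
class Prod:            # σ × τ
    left: object
    right: object

@dataclass(frozen=True)
class Arrow:           # (σ → τ)
    dom: object
    cod: object

@dataclass(frozen=True)
class Power:           # [σ]
    elem: object

def level(t):
    """Type level of a symbol; ground types get level 0 (assumed convention)."""
    if isinstance(t, Ground):
        return 0
    if isinstance(t, Prod):
        return max(level(t.left), level(t.right))
    if isinstance(t, Arrow):
        return max(level(t.dom) + 1, level(t.cod))
    if isinstance(t, Power):
        return level(t.elem) + 1
    raise TypeError(f"not a type symbol: {t!r}")

N = Ground("N")
seq = Arrow(N, N)                    # N^N, in which reals may be coded
real_fn = Arrow(seq, seq)            # functions on (codes of) reals
operator = Arrow(real_fn, real_fn)   # operators on such functions
```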


2.1.4. Reducing the number of basic concepts; finite type structures 
The definition of Cartesian closed universe taken above was chosen as one 
most convenient and natural for mathematical applications. We can, 
however, make some simplifications for universes of finite type over given 
sets, in one of the following two ways. 

(1) Take a ground type γ_0 with 0, 1 ∈ A_{γ_0}. Restrict to the types built up
from γ_0, ..., γ_n by the operation σ, τ ↦ (σ → τ). Subsets of A_σ are treated
in terms of their characteristic functions in A_{(σ→γ_0)}. To dispense with
pairing and projection, operations of several variables are replaced by
unary functions and repeated application (e.g., (σ_1 → (σ_2 → τ)) takes the
place of (σ_1 × σ_2 → τ)). Partial functions from (a subset B of) A_σ to A_τ are
treated as restrictions of total functions in A_{(σ→τ)} (which may be assigned
values arbitrarily for arguments not in B). This is the approach that we
shall actually follow in the logical development of Sections 4-8.

(2) Types are built up only by the operations σ, τ ↦ σ × τ, [σ]. In this
case functions from A_σ to A_τ (partial or total) are identified with their
graphs in A_{[σ×τ]}.


Note. Neither of these simplifications is suitable for constructive 
mathematics. For example, if f is a partial computable operation from A_σ
to A_τ we cannot in general extend f to a total computable operation on A_σ.


2.2. Closure under quantification; choice operators 
For any U and set A in U, define the function(al) ∃^A : P(A) → {0, 1} by

        ∃^A X = 0 ⇔ X is non-empty.                                    (1)

Equivalently (when 0 ∈ B), define ∃^A : (A → B) → {0, 1} by


        ∃^A f = 0 ⇔ ∃a ∈ A (fa = 0),
        ∃^A f = 1 ⇔ ∀a ∈ A (fa ≠ 0).                                   (2)


We say that U is ∃^A-closed if ∃^A is a function of U.
By a choice or selection operator C for ∃^A we mean a functional from
P(A) to A such that


        X ⊆ A and ∃^A X = 0 ⇒ CX ∈ X.                                  (3)
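
For a finite set A the two forms of the quantification functional and a selection operator can be written out directly. The sketch below assumes A finite (so that the search terminates) and uses hypothetical names; taking the minimum as the selected element is an arbitrary choice made for the illustration. Note that 0 plays the role of "true" throughout, as in (1) and (2).

```python
def exists_set(X):
    """Form (1): returns 0 iff X is non-empty."""
    return 0 if len(X) > 0 else 1

def exists_fn(A, f):
    """Form (2): returns 0 iff fa = 0 for some a in A (A finite here)."""
    return 0 if any(f(a) == 0 for a in A) else 1

def char_fn(X):
    """Characteristic function of X in the sense of 2.1.1: value 0 iff a is in X."""
    return lambda a: 0 if a in X else 1

def choice(X):
    """A selection operator as in (3): some element of X when X is non-empty.
    Returning the least element is an arbitrary choice for this sketch."""
    return min(X) if X else None

A = frozenset(range(5))
evens = frozenset({0, 2, 4})
empty = frozenset()

# The two forms agree when subsets are replaced by characteristic functions.
agree_nonempty = exists_set(evens) == exists_fn(A, char_fn(evens))
agree_empty = exists_set(empty) == exists_fn(A, char_fn(empty))
picked = choice(evens)
```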


2.3. Universes over the natural numbers 


2.3.1. A structure 𝔐 is said to belong to U if all of its domains, relations,
and functions are in U.

Suppose the structure 𝔑 = (N, 0, ′) belongs to U. i, j, k, n, m, p range over
N. The N-recursion operators are the functions R (or R_A) defined by:


        Raf0 = a,    Rafn′ = fn(Rafn)                                  (1)


for each A, a ∈ A, f ∈ (N, A → A) and n ∈ N. These functionals were
introduced in GÖDEL [1958]. They have an impredicative or non-elementary
character, illustrated as follows for A = (N^N → N); here we take
‘g’ in place of ‘a’ and let h range over N^N.


        Rgf0h = gh,    Rgfn′h = fn(Rgfn)h.                             (2)


Writing Rgfn as λh_1(Rgfnh_1) we see that Rgfn′h depends prima facie on
Rgfnh_1 for all h_1 in N^N. Elementary recursion operators R̂ (or R̂_A) may be
defined as follows for any A = (B_1, ..., B_k → N):


        R̂gf0b = gb,    R̂gfn′b = fn(R̂gfnb)b                             (3)

where b = b_1, ..., b_k, each b_i ∈ B_i and g ∈ A.


NOTE. The operators R̂ are essentially those introduced by KLEENE
[1959c]. It was shown there that the functions in (N → N) generated by 0, ′,
and all K, S, R̂ are exactly the primitive recursive functions. A general
recursive function which is not primitive recursive can be obtained using
the operators R instead; familiar examples of such are defined by double
recursion on N, which may be recast as a single recursion on N^N.


2.3.2. We say that U is N̂-closed if it contains the structure 𝔑 and all of the
elementary recursion operators R̂. U is said to be N-closed if instead it
contains all the recursion operators R. The functions of finite type over N
generated by 0, ′, and all K, S, R form the primitive recursive functionals (in
Gödel’s sense). From now on, all universes considered are assumed to be at
least N̂-closed.
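
The remark that double recursion on N can be recast as a single recursion at a function type can be illustrated concretely. The sketch below implements the recursion equations (1) of 2.3.1 by iteration and uses them twice: at type N to define addition, and at type (N → N) to define the Ackermann-Péter function, which is a standard example of a recursive function that is not primitive recursive (the choice of this example, and the names add, ack and step, are assumptions of the illustration).

```python
def R(a, f, n):
    """Recursion operator: R a f 0 = a, R a f (n+1) = f n (R a f n).
    Computed by iteration; f is curried and applied as f(n)(value)."""
    acc = a
    for k in range(n):
        acc = f(k)(acc)
    return acc

succ = lambda n: n + 1

def add(m, n):
    """Recursion at type N: add(m, 0) = m, add(m, n+1) = succ(add(m, n))."""
    return R(m, lambda k: lambda v: succ(v), n)

def ack(m):
    """Recursion at type (N -> N): the value built at each stage is a function.
    ack(0) = succ; ack(m+1) is obtained from g = ack(m) by
    ack(m+1)(0) = g(1) and ack(m+1)(n+1) = g(ack(m+1)(n))."""
    step = lambda k: lambda g: (lambda n: R(g(1), lambda j: lambda v: g(v), n))
    return R(succ, step, m)
```

The outer recursion in ack runs over N but at each step hands on a whole function g, which is where the dependence on all values of the previous stage, noted after (2), becomes visible.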


2.3.3. In order to reflect that quantification over N is accepted as definite,
we may demand that U be ∃^N-closed.⁴ The unbounded minimum operator μ
is a choice operator for ∃^N, taking as its definition:


        μX = least n belonging to X   if ∃^N X = 0,
        μX = 0                        otherwise.                       (1)


Since the graph of μ is definable from ∃^N and < (and the latter is defined
by primitive recursion), μ is in U iff ∃^N is in U.


2.4. Universes over the real numbers


If the set of real numbers R and a structure on it, say ℜ =
(R, =_R, +, ·, 0, 1), are considered as basic, then one may deal with universes
which contain ℜ and are ∃^R-closed. If instead R is defined in terms of N^N (as
is done below), one may assume U to be ∃^{N→N}-closed. Note that ≤_R is
definable using ∃^R. (We did not need to include =_N in 𝔑 because it is
definable by recursion.)


⁴ ∃^N is often denoted by ²E in the literature on recursive functionals of finite type. For
reference below, ∃^{N→N} is then denoted by ³E.


2.5. Closure conditions for well-foundedness, ordinals, and inductive def- 
initions 
2.5.1. Well-founded relations 

We write a <_B b for (a, b) ∈ B when B ⊆ A² = A × A and B is transitive.
The test for a relation B in A to be well-founded (relative to U) is
given via a functional ∃|^A from P(A²) into {0, 1}:


        ∃|^A B = 0   if ∃f ∈ (N → A) ∀n [fn′ <_B fn],
        ∃|^A B = 1   otherwise.                                        (1)


We emphasize that this test is relative to U, as <_B need not be
well-founded in the absolute sense if ∃|^A B = 1: (N → A) may be a
proper subset of the set of all functions from N into A.⁵ If B is truly
well-founded we can apply the principle of transfinite induction to any
subset X of A (whether or not X is a U-set):


        ∀x ∈ A [∀y (y <_B x ⇒ y ∈ X) ⇒ x ∈ X] ⇒ ∀x ∈ A (x ∈ X).       (2)
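
On a finite set the test (1) and the induction principle (2) can both be checked exhaustively. The sketch below is an illustration under that finiteness assumption, with hypothetical names: for finite A and transitive B a descending sequence exists exactly when a <_B a for some a, since an infinite sequence in a finite set must repeat a value. It then lists the subsets X satisfying the hypothesis of (2) for a well-founded relation and for one that is not.

```python
from itertools import combinations

def descending_test(A, B):
    """Value of the test (1) for finite A and transitive B:
    0 if some a has a <_B a (giving a constant descending sequence), else 1."""
    return 0 if any((a, a) in B for a in A) else 1

def progressive(A, B, X):
    """Hypothesis of (2): every x whose <_B-predecessors all lie in X is in X."""
    return all(x in X for x in A
               if all(y in X for y in A if (y, x) in B))

def subsets(A):
    items = sorted(A)
    return [set(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

A = {0, 1, 2, 3}
less = {(a, b) for a in A for b in A if a < b}   # transitive, well-founded
loop = less | {(3, 3)}                           # transitive, but 3 <_B 3

# Under `less` the only progressive subset is A itself, as (2) asserts.
progressive_less = [X for X in subsets(A) if progressive(A, less, X)]
# Under `loop` the set {0, 1, 2} is progressive yet omits 3, so (2) fails.
progressive_loop = [X for X in subsets(A) if progressive(A, loop, X)]
```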


2.5.2. When A is N (or any other well-ordered set), we can define a
selection operator L for the quantifier ∃|^A above in the sense that


        ∃f ∈ (N → N) ∀n [fn′ <_B fn] and g = LB ⇒ ∀n [gn′ <_B gn].     (1)


For example, g may be described as the leftmost branch through B and g
may be defined by recursion in terms of ∃|^N. Note that testing for
well-foundedness of B ⊆ N² is weaker than use of ∃^{N→N}.


2.5.3. Ordinals 

These may be dealt with as equivalence classes of well-ordering rela- 
tions. However, for a more natural treatment of closure conditions 
corresponding to practice one would consider countable ordinals as a 
ground type Ω_1, and would assume associated recursion operators R^{Ω_1} of
finite type over Ω_1. Similarly, uncountable ordinals would be treated using
further ground types Ω_2, Ω_3, etc. Because of limitations of space, we shall
not pursue this here. 

⁵ For example, in Section 7 we shall consider finite-type structures U over N in which
(N → N) consists of just the hyperarithmetic functions. In this case there are examples of
recursive B for which ∃|^N B = 1 according to (1) but <_B is not well-founded (with regard to
all functions).
2.5.4. Inductive definitions
Suppose given certain closure conditions on a set X ⊆ A, which may be
put in the form


        ∀a ∈ A [C(a, X) ⇒ a ∈ X]                                       (1)

where C is a relation between A and P(A) such that

        C(a, X) & X ⊆ Y ⇒ C(a, Y).                                     (2)


We speak of the set S inductively generated by these closure conditions as
the smallest set X which satisfies (1). It is familiar that S is set-theoretically
definable in one of two ways, “from above” as


        a ∈ S ⇔ ∀X ⊆ A [∀x (C(x, X) ⇒ x ∈ X) ⇒ a ∈ X]                 (3)


or “from below” using ordinals α and approximations S_α:

        S = S_ᾱ, where S_ᾱ = S_{ᾱ+1}, and a ∈ S_α ⇔ C(a, ⋃_{β<α} S_β) for each α.   (4)


The approach (3) implicitly involves the condition on a universe U that it
contains ∃^{P(A)} or ∃^{(A→N)}, while approach (4) implicitly involves some sort of
ordinal closure conditions as just indicated in 2.5.3. Inductive definitions of
this kind will come up at several points in our work but again cannot be
pursued in full. They are discussed in detail in Chapter C.7.
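
For a finite set A both definitions can be computed and compared. The sketch below is an illustration under that assumption, with hypothetical function names and a sample closure condition chosen only for the example: 1 is put into X, and 2b is put into X whenever b is. The "from below" construction iterates finite stages until nothing new is added (for monotone C the stages increase, so the union of the earlier stages is just the previous stage); the "from above" construction intersects all subsets closed under C.

```python
from itertools import combinations

def closed(A, C, X):
    """X satisfies the closure condition (1)."""
    return all(a in X for a in A if C(a, X))

def from_below(A, C):
    """Stages as in (4), restricted to finite A and monotone C:
    start from the empty set and apply C until the stage is stable."""
    stage = frozenset()
    while True:
        nxt = frozenset(a for a in A if C(a, stage))
        if nxt == stage:
            return stage
        stage = nxt

def from_above(A, C):
    """Definition (3): the intersection of all closed subsets of A."""
    items = sorted(A)
    result = set(items)
    for r in range(len(items) + 1):
        for c in combinations(items, r):
            X = frozenset(c)
            if closed(A, C, X):
                result &= X
    return frozenset(result)

# Sample closure condition (hypothetical): 1 is in X; 2b is in X if b is.
A = frozenset(range(10))
C = lambda a, X: a == 1 or (a % 2 == 0 and a // 2 in X)

generated = from_below(A, C)
intersection = from_above(A, C)
```

The search in from_above ranges over all subsets of A, which is the finite shadow of the quantification over P(A) that approach (3) requires of a universe.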


2.6. Comparison of universes 


There is no simple inclusion relation between universes. Given two
universes U_1 and U_2 over N, we can compare (N → N)_{U_1} with (N → N)_{U_2}, e.g.,
the former may be included in the latter. But once we ascend to higher
types, e.g., to ((N → N) → N)_{U_i}, comparability in the ordinary sense is lost
because the domains of the functions involved are different. However, 
there is a sense in which we can talk about maximal and minimal universes 
of a certain kind, and which we now wish to indicate. 


Suppose ground-type symbols γ_0, ..., γ_n are given and sets A_{γ_i} = A_i are
specified. For simplicity we consider only symbols built up by
σ, τ ↦ (σ → τ). The maximal finite type structure U_max over A_0, ..., A_n is
determined by:

        A_{(σ→τ)} = (the totality of functions f: A_σ → A_τ).           (1)


This definition is supposed to be construed in some absolute set-theoretical
sense.
We speak about structures U_min only with reference to certain closure
conditions. For example, the minimal Cartesian-closed structure over
A_0, ..., A_n should be that whose functions are built up only by application
from all K_{σ,τ}, S_{σ,τ,ρ}. To explain this more precisely we consider terms
s, t, ... built up by application from constants K_{σ,τ} and S_{σ,τ,ρ} of respective
types (σ → (τ → σ)) and (σ → (τ → ρ)) → [(σ → τ) → (σ → ρ)]. These
terms are then identified according to the rules:


        Kst = s,    Sstu = su(tu).                                     (2)


The objects of the minimal structure are just equivalence classes of terms;
application is trivially defined. Similarly, one may deal with the minimal
N-closed structure by considering terms built up using 0, ′, K_{σ,τ}, S_{σ,τ,ρ}, R_σ and
by extending the definition of = to correspond to (1) of 2.3.1. An
elaboration of this approach brings us to questions of normalization of
terms which will be discussed in 8.2.
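
The identification of terms by the rules (2) can be illustrated by reading the rules from left to right as reductions. The sketch below is an assumption-laden illustration, not the construction of 8.2: terms are untyped (atoms are strings, application is a pair), the reduction strategy is leftmost, and a step budget guards against untyped terms without normal form. The names app, step and normalize are hypothetical.

```python
def app(*terms):
    """Left-associated application: app(f, a, b) is ((f a) b)."""
    result = terms[0]
    for t in terms[1:]:
        result = (result, t)
    return result

def step(t):
    """One leftmost reduction step by rules (2); None if t is normal."""
    if isinstance(t, tuple):
        f, u = t
        # K s t -> s
        if isinstance(f, tuple) and f[0] == "K":
            return f[1]
        # S s t u -> s u (t u)
        if isinstance(f, tuple) and isinstance(f[0], tuple) and f[0][0] == "S":
            s, tt = f[0][1], f[1]
            return ((s, u), (tt, u))
        reduced = step(f)
        if reduced is not None:
            return (reduced, u)
        reduced = step(u)
        if reduced is not None:
            return (f, reduced)
    return None

def normalize(t, fuel=1000):
    """Reduce until no rule applies, within a step budget."""
    for _ in range(fuel):
        nxt = step(t)
        if nxt is None:
            return t
        t = nxt
    raise RuntimeError("no normal form found within the step budget")

# SKK behaves as the identity, and S(KS)K as composition (cf. 2.1.2).
identity_result = normalize(app("S", "K", "K", "a"))
compose_result = normalize(app(app("S", app("K", "S"), "K"), "f", "g", "x"))
```

Two terms with the same normal form are identified by (2); whether every typed term has a normal form is exactly the kind of question deferred to 8.2.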


Note. Clearly, minimal structures U generated by countably many closure 
conditions over countable ground sets will be countable. 


3. Relations of the closure conditions to mathematical practice 


This section is of necessity very brief and sketchy. Only a detailed 
development would give precise meaning to the statements considered and 
then only step-by-step comparison with actual mathematics would consti- 
tute a verification of the assertions. However, the following should give a 
good idea of the kind of relationship that is claimed. For illustrative 
purposes the emphasis is on analysis. 


3.1. Relativization of notions to universes 


Let U be any N-closed universe.


3.1.1. Set-theoretical notions are relativized to U simply by speaking
about U-functions and U-sets instead of arbitrary functions and sets. This
induces a relativization of all further mathematical concepts expressed in
terms of 𝔑 and in terms of pairing functions, application, sets, and
membership. In particular, infinite sequences (a_n)_n of objects are identified
with functions whose domain is N. The classical notions are a special case
obtained by taking U to be the maximal finite type structure U_max over
given ground types. However, two notions which are equivalent in U_max
need not be equivalent in general U. This sometimes requires a judicious
choice among definitions of concepts for the smoothest development.


3.1.2. The notion of real number is relativized to U either by taking reals
to be U-sets which are Dedekind sections of rationals or to be U-sequences
of rationals which satisfy the Cauchy convergence criterion. In
the latter case, reals are identified by an equivalence relation in the usual
way. The two approaches are equivalent in ∃^N-closed U: to each Cauchy
sequence (r_n)_n corresponds the section {r | ∃m ∀n ≥ m (r <_Q r_n)}. The
arithmetical operations are easily extended to the U-reals for either definition.
Note that Cantor’s diagonal argument shows the set of U-reals to be
uncountable in U, but of course it may be countable outside of U.


3.1.3. The notion of metric space S in U is defined from the reals in U as
usual. S is called separable (relative to U) if it contains a dense subsequence
(d_n)_n in U. S is called complete (relative to U) if every Cauchy
sequence (x_n)_n of elements of S which lies in U converges in S. All of the
spaces treated in the restricted developments are complete and separable.
These include all spaces of interest in classical mathematics. Basic open sets
in S are spheres S(x; r) = {y | d_S(y, x) < r} specified by a center x and
radius r. The following definition of open set X is most convenient, namely
as one specified by a U-sequence ((x_n, r_n))_n where X = ⋃_n S(x_n; r_n). Note
that such a union is in U when U is ∃^N-closed. Closed sets for the space S
are complements S − X with X open. If X is closed and (x_n)_n is a
convergent subsequence (in U) of X, then lim_{n→∞} x_n ∈ X. However, if X is
in U and closed under limits in this sense, it need not be the complement of
an open set in S. It is clear how to go on to define continuous function
f: S_1 → S_2 where f is in U, and S_1, S_2 are metric spaces. Following this
further we may relativize the notions of normed vector space, linear
functionals, linear operators, etc. Most interesting function spaces used to
illustrate these require theories of integration, which in turn require further
closure conditions to be considered in 3.2-3.4. Also further notions such as
compact set, Borel set, measurable set must be delayed for the same reason.


3.2. Mathematics in ∃^N-closed universes


This coincides with the practice of predicative mathematics. We suppose
throughout 3.2 that U is ∃^N-closed.


3.2.1. It turns out that all of analysis to 1900 and a substantial portion of
twentieth-century analysis can be relativized in a straightforward way to
any such U. The actual execution of this is due (in effect) to Weyl (1918) for
the classical theory of real continuous and complex analytic functions,
extended further by Kondô, Lorenzen and Kreisel into a theory of
measurable sets and functions.⁶ As a final indication of the extent of
analysis covered one may refer to BISHOP [1967] (p. 9), where it is stated
that each constructive theorem φ found by him is a substitute for a
standard classical theorem ψ which follows from φ plus the “limited
principle of omniscience”; the latter is simply an informal statement of
closure under ∃^N.


3.2.2. Sets vs. sequences 

At first sight it appears that there is a serious obstacle to such an
extensive development, since the U-reals are not in general closed under
sup and inf of bounded sets of reals in U. An example of such a universe U
will be given in 7.3. One easily sees the difficulty if reals are treated as
lower Dedekind sections in Q. Then a set A of reals is a subset of P(Q),
and its sup should be Y = ⋃A = ⋃X [X ∈ A]. But then r ∈ Y ⇔
∃X [X ∈ A and r ∈ X]; this definition requires quantification over P(Q)
which is not generally available in U, so that Y need not be a U-set (and
hence need not determine a U-real). Similarly for the same purpose one
must quantify over Q^N or N^N if reals are treated as Cauchy sequences.


3.2.3. Compactness, completeness and continuity 

There is no problem in dealing with $\sup_n x_n$ and $\inf_n x_n$ for bounded
sequences of reals in $\mathfrak{U}$, since these are definable in terms of $(x_n)_n$ by use of
$\exists^N$. All the mathematics which generalizes to $\exists^N$-closed universes goes
through by dealing systematically with sequences rather than sets at crucial
points. As an example, one proves the sequential compactness of any
closed interval $[a, b]$ in $\mathbb{R}$ by the usual subdivision argument. Given $(x_n)_n$ in
$[a, b]$, we successively divide by halves so that at each stage $k$, $[a_k, b_k]$ is the
leftmost interval which contains $x_n$ for infinitely many $n$. Note that this
combines use of $\exists^N$ and the recursion operators $R$. From this result of course
follows the completeness of $\mathbb{R}$ in its relativized sense. The usual topological notion of
compactness in terms of arbitrary open coverings is not used at all with the
given restricted closure conditions.
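
The subdivision argument can be made concrete in a case where the question "does this half contain $x_n$ for infinitely many $n$?" is decidable. The following Python sketch is only an illustration under that assumption: it treats a periodic sequence, so the test reduces to inspecting one period, and the name `cluster_interval` is hypothetical. In general this test is exactly where closure under numerical quantification is needed, and the iteration over stages is where recursion enters.

```python
from fractions import Fraction

def cluster_interval(period_values, a, b, steps):
    """Halve [a, b] `steps` times, always keeping the leftmost half that is
    hit infinitely often by the sequence repeating `period_values` forever."""
    def hit_infinitely_often(lo, hi):
        # Assumption: the sequence is periodic, so it meets [lo, hi]
        # infinitely often iff some value of one period lies in [lo, hi].
        return any(lo <= x <= hi for x in period_values)

    for _ in range(steps):
        mid = (a + b) / 2
        if hit_infinitely_often(a, mid):
            b = mid          # left half qualifies: keep it (leftmost choice)
        else:
            a = mid          # otherwise the right half must qualify
    return a, b
```

Each stage halves the length of the interval, so the left endpoints form a Cauchy sequence whose limit is a cluster point of the given sequence.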

The properties of real continuous functions follow readily because it is
sufficient to deal with their values at rational arguments. Hence
$\sup\{fx \mid a \le x \le b\}$ and $\inf\{fx \mid a \le x \le b\}$ can be proved to exist and to be
taken on by $f$ when $f$ is continuous on $[a, b]$.


* Cf. GRZEGORCZYK [1955] for a more modern reworking of Weyl's analysis, as well as
LORENZEN [1965]. For the theory of measurability see LORENZEN [1951] and KREISEL [1962].


3.2.4. The following illustrates Bishop's assertion above. One constructive
form of Brouwer's Fixed Point Theorem is that if $f$ is a continuous map of
the unit disk $D$ into itself, then for each $\varepsilon$ there exists $x$ with $d(fx, x) < \varepsilon$.
Hence there exists a sequence $(x_n)_n$ in $D$ such that for each $n$, $d(fx_n, x_n) <
1/n$. The classical conclusion $\exists x \in D\,(fx = x)$ is a consequence by sequential
compactness. However, one can also follow familiar classical proofs
more directly: one assumes $\forall x \in D\,(fx \ne x)$ and obtains a contradiction.
(See Sections 11 and 12.4 in Chapter D.5.)


3.2.5. Integration and measure 

Riemann integration is of course treated sequentially in the standard
way. More general theories of integration are nowadays derived from
theories of measure. However, even Lebesgue (outer) measure requires the
full inf operation on sets for its definition: $\mu^*(A) = \inf\{\mu(G) \mid G \text{ open},
A \subseteq G\}$. It is not a problem to define $\mu(G)$ for $G = \bigcup_n (a_n, b_n)$; given $G$
as a union of intervals one can find such a representation with the $(a_n, b_n)$
disjoint and then take $\mu(G) = \sum_{n=0}^{\infty} (b_n - a_n)$. While one cannot make use
of outer measure, there is no difficulty in dealing with measurable sets $A$:
classically, these have the property that for each $\varepsilon > 0$ there exist open $G$
and closed $F$ with $F \subseteq A \subseteq G$ and $\mu(G - F) < \varepsilon$. By a measure approximation
to $A$ let us mean a sequence $((G_n, F_n))_n$ (in $\mathfrak{U}$) where each $G_n$ is
open, $F_n$ is closed, $F_n \subseteq A \subseteq G_n$, and $\mu(G_n - F_n) < 1/n$. Then we can take
$\mu(A) = \inf_n \mu(G_n)$. Further, we have closure of the measurable sets under
countable union, when sequences $(A_m)_m$ of such are given together with
measure approximations $((G_{m,n}, F_{m,n}))_{n,m}$. Similarly we may deal with the
theory of measurable functions in terms of approximations by step
functions. Thus the function spaces $L_p$ are available relativized to $\mathfrak{U}$. From
this one may proceed to the theory of Banach spaces and Hilbert spaces
with the $L_p$ as principal examples.
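
As a finite illustration of the step "find a representation by disjoint intervals, then sum the lengths", the following Python sketch merges finitely many open intervals and adds up the lengths of the resulting pieces. The function name `measure_of_union` and the restriction to finite lists with exact rational endpoints are assumptions made for the example; the text itself concerns countable sequences of intervals.

```python
from fractions import Fraction

def measure_of_union(intervals):
    """Return (disjoint_pieces, total_length) for a finite union of open
    intervals given as pairs (a, b)."""
    # Drop empty intervals and sort by left endpoint.
    pieces = sorted((a, b) for a, b in intervals if a < b)
    merged = []
    for a, b in pieces:
        if merged and a < merged[-1][1]:
            # Overlaps the previous piece: extend that piece.
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    total = sum((b - a for a, b in merged), Fraction(0))
    return merged, total
```

Intervals that merely touch at an endpoint are kept as separate pieces, since their union is not an interval; this does not affect the total length.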


3.2.6. RELATIVIZED KÖNIG'S LEMMA. A finitely branching but infinite tree $X$
in $\mathfrak{U}$ contains an infinite branch in $\mathfrak{U}$.


This holds in any $\exists^N$-closed universe $\mathfrak{U}$ and can be proven by the usual
argument. Taking an example of an infinite recursive binary tree without
recursive branches, it is seen that $N$-closure alone is not sufficient for this
result.

Sequential compactness of $\mathbb{R}$ can be viewed as a consequence of König's
Lemma. The lemma also implies compactness in first-order predicate calculus. To
carry out first-order model theory, one needs the satisfaction relation for
any structure. This is provided for any countable structure by combining $\exists^N$
and definition of a sequence of predicates $(\mathrm{Sat}_n)_n$ by recursion $R$. Thus
countable model theory relativizes to any $\exists^N$-closed universe.

Another noteworthy consequence of König's Lemma is Ramsey's
Theorem (for a proof of the implication see Chapter B.3).


3.2.7. Relativization to N-closed structures 

Closer inspection of the arguments shows that the relativization of
analysis described above need use throughout only the elementary recursion
operators $\hat{R}$ together with $K$, $S$ and $\exists^N$. This depends on working
systematically only with sequences and with countable dense subsets of
spaces. However, the possibility of this restriction is not always immediate.
For example, in the Cauchy-Lipschitz Theorem on existence of local
solutions of first-order differential equations we define a sequence $f_n$ of real
functions by successive approximation involving integration. This standard
proof goes through without change if we use the full operators $R$ but must be
handled more carefully when using only $\hat{R}$. On the other hand, the full operators $R$ are
needed to define satisfaction in countable structures. This will follow from
results of Section 8.


3.3. Well-foundedness closure conditions 


The countable ordinals and associated definition by recursion and proof 
by induction enter if one wishes to speak of Borel sets or Baire functions or 
the sequence of derivatives of a closed set. One may also mention here their 
use in algebra, e.g., in the structure theory of countable Abelian groups and 
modules. To relativize these in a natural way, one would wish to consider 
certain closure conditions using ordinals as one ground type. Alternatively, 
one could deal with ordinals via well-ordering relations. 

A form of the test $\exists{\downarrow}^N$ for well-foundedness may be used to define the
analytic sets in descriptive set theory. The recursion-theoretic hierarchies
which provide analogues of the Borel, analytic and other classifications also
make heavy use of generalized inductive definitions, well-orderings, and
transfinite induction and recursion in their definition and development. For
example, the statement that any two well-orderings are comparable is used
to prove the reduction and uniformization theorems for $\Pi^1_1$ sets.


3.4. $\exists^{N \to N}$-closed structures


It is already evident from 3.2 where quantification over $\mathbb{R}$ enters analysis
in an essential way, namely in the general theory of measure. One may also
add the definition and study of projective sets in descriptive set theory.

As shown by FRIEDMAN [1970b], even $\exists^{N \to N}$-closure of $\mathfrak{U}$ is insufficient to
generalize Borel determinacy to such $\mathfrak{U}$. MARTIN [1975] has shown that
impredicative transfinite types are essential for his solution of this problem.


3.5. The Axiom of Choice 


This is not a closure principle of the kind we have been discussing in the
preceding sections and does not follow from them. The uncountable axiom
of choice is used to obtain such mathematically significant results as the
Hahn-Banach Theorem, maximal ideals in Banach spaces, etc. BISHOP
[1967] has found constructive substitutes for certain of these. His work
suggests that there may be versions which are provable using some of the above
closure conditions and which serve for concrete applications. The countable
axiom of choice is true in some natural $N$-closed and $\exists^N$-closed
structures, as we shall see in Section 7. In addition, we shall discuss results
in Section 8 for some formal systems of finite type which show eliminability
of certain forms of countable AC.


4. Theories of finite type based on the closure conditions 


4.1. Syntax 


4.1.1. Types 

To simplify the logical work, from now on we shall use the reductions 
indicated in 2.1.4(1). The type symbols sufficient for all our purposes are 
generated as follows: 

(i) 0 is a type symbol.

(ii) If $\sigma$, $\tau$ are type symbols, then so also is $(\sigma \to \tau)$.
0 is the type symbol for $N$; $(\sigma \to \tau)$ is sometimes also denoted $(\sigma)\tau$. The
type symbol $n$ is defined by induction with


$n + 1 = (n \to 0)$. (1)


Each language L to be considered has a specified set of types Typ(L)
containing 0 and closed under subtypes. The level of a type symbol is
defined by

$\mathrm{lev}(0) = 0$, $\mathrm{lev}(\sigma \to \tau) = \max(\mathrm{lev}(\sigma) + 1, \mathrm{lev}(\tau))$. (2)


Thus $\mathrm{lev}(n) = n$. We write $(\sigma_1, \ldots, \sigma_n \to \tau)$ for $(\sigma_1 \to \cdots \to \sigma_n \to \tau)$, where
parentheses are associated to the right. Note that every $\sigma \ne 0$ is of the form
$(\sigma_1, \ldots, \sigma_n \to 0)$.
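
To make the inductive definitions concrete, here is a small Python sketch of type symbols, the level function of (2) and the numbered types of (1). The class and function names (`Base`, `Arrow`, `lev`, `pure`, `arrows`) are illustrative choices and not notation from the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Base:
    """The ground type 0 (natural numbers)."""

@dataclass(frozen=True)
class Arrow:
    """The type (dom -> cod)."""
    dom: object
    cod: object

def lev(t):
    # lev(0) = 0, lev(sigma -> tau) = max(lev(sigma) + 1, lev(tau))
    if isinstance(t, Base):
        return 0
    return max(lev(t.dom) + 1, lev(t.cod))

def pure(n):
    # The type symbol n: 0 is the ground type and n + 1 = (n -> 0).
    t = Base()
    for _ in range(n):
        t = Arrow(t, Base())
    return t

def arrows(args, result):
    # (s1, ..., sn -> t) abbreviates (s1 -> (... -> (sn -> t))),
    # with parentheses associated to the right.
    t = result
    for s in reversed(args):
        t = Arrow(s, t)
    return t
```

For instance, `lev(pure(3))` evaluates to 3, in line with the remark that the type symbol n has level n.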


4.1.2. Terms 

Every L to be considered has infinitely many variables of each of its
types. These are denoted by $a, b, c, x, y, z$. (The type superscript in $a^\tau$ is used
with variables only when necessary to avoid ambiguity.) We also use
$k, n, m, p$ as variables of type 0 and $f, g, h$ as variables of any type $(\sigma \to \tau)$.
L is specified by a certain set of constant symbols of various types as well as
operators and predicates of level 1, i.e., having only arguments of type 0.
Among these always are the constant 0 of type 0, the operator Sc on type 0
and the predicate = between type 0 objects. The terms of various types
are generated as follows:

(i) Each variable and constant of L of type $\tau$ is a term of type $\tau$.
(ii) If $t$ is of type $(\sigma \to \tau)$ and $s$ is of type $\sigma$, then $ts$ is of type $\tau$.
(iii) If $F$ is an $n$-ary level 1 operator of L and $t_i$ is of type 0 ($1 \le i \le n$),
then $F(t_1, \ldots, t_n)$ is of type 0.

We write $t'$ for $\mathrm{Sc}(t)$, $\mathbf{t}$ for $t_1, \ldots, t_n$, and $s\mathbf{t}$ for $st_1 \cdots t_n$.


4.1.3. Formulas 

The atomic formulas are all those of the form $P(t_1, \ldots, t_n)$ where $P$ is an
$n$-ary level 1 predicate of L and $t_i$ is of type 0 ($1 \le i \le n$). In particular,
$s = t$ is always a formula for $s$, $t$ of type 0. The formulas are generated by
$\neg, \wedge, \vee, \to, \forall, \exists$, the quantifiers being applied to variables of any type in
L. $\varphi, \psi, \theta, \ldots, \varphi(a), \varphi(a, b), \ldots$ range over formulas. Any formula may
contain free variables (parameters) other than those indicated. By $\mathrm{lev}(\varphi)$
we mean $\max(\mathrm{lev}(\tau))$ such that $\tau$ occurs in $\varphi$. If $\mathcal{F}$ is any set of formulas,
$n$-sec($\mathcal{F}$) denotes the set of $\varphi$ in $\mathcal{F}$ with $\mathrm{lev}(\varphi) \le n$; this is called the
$n$-section of $\mathcal{F}$ or the $(n+1)$-st order part of $\mathcal{F}$. For each $\tau$, variables
$X, Y, Z$ are introduced by convention to range over $(\tau \to 0)$ (which in this
case might be denoted $[\tau]$), and we write


$a \in X \leftrightarrow Xa = 0$. (1)


4.1.4. Equality at higher types and extensionality 
For any type $\tau$, we define the predicate $=_\tau$ of equality between objects of
type $\tau$ inductively by

$f =_{\sigma \to \tau} g \leftrightarrow \forall a^\sigma\,[fa =_\tau ga]$. (1)


Thus for $\tau = (\sigma_1, \ldots, \sigma_n \to 0)$, $f =_\tau g \leftrightarrow \forall a_1^{\sigma_1}, \ldots, a_n^{\sigma_n}\,[f\mathbf{a} = g\mathbf{a}]$. In order for
these relations to satisfy the laws of equality one would have to accept all
formulas


(Ext) $a =_\sigma b \to fa =_\tau fb$


where $f$ is of type $\sigma \to \tau$. From (Ext) and the laws of equality for type 0 we
can then prove $[a =_\sigma b \wedge \varphi(a) \to \varphi(b)]$, i.e., that definitionally equal
objects are indiscernible.

An alternative to this treatment would be to take $=_\tau$ as a basic predicate
for each type $\tau$ and the laws of equality in all types as basic axioms. Then
extensionality is expressed by the scheme


$\forall x\,[fx =_\tau gx] \to f =_{\sigma \to \tau} g$. (2)


The principle (Ext) above has the same effect in our approach using 
definable equality at higher types. It is perhaps more natural for the 
mathematical applications to take equality at all types as basic, though 
extensionality need not be assumed. Our approach has some technical 
advantages, but we must show that (Ext) is eliminable; this will be 
discussed in 4.4.2. 


Note. The subscript will be dropped in higher type equality when there is 
no ambiguity. 


4.1.5. Number-theoretical axioms 

Each theory T to be considered will use classical logic (unless otherwise 
specified) and will contain the following axioms for 0 and Sc: 

(i) $x' \ne 0$,

(ii) $x' = y' \to x = y$.
Let L = L(T) be the language of T. By the induction scheme for $N$ in L we
mean all formulas of the form


(Ind$^N$) $\varphi(0) \wedge \forall n\,[\varphi(n) \to \varphi(n')] \to \forall n\,\varphi(n)$


where $\varphi$ belongs to L. Unless otherwise restricted, each T will contain the
full induction scheme (Ind$^N$) in this sense. The only other case to be
considered, and which is used only when $1 \in$ Typ(L), is the following
(restricted) induction axiom for $N$:


(I$^N$) $0 \in X \wedge \forall n\,[n \in X \to n' \in X] \to \forall n\,(n \in X)$.


4.1.6. Relations between theories

We put $T_1 \subseteq T_2$ when $L(T_1) \subseteq L(T_2)$ and every theorem of $T_1$ is a theorem
of $T_2$. We put $T_1 = T_2$ if $T_1 \subseteq T_2$ and $T_2 \subseteq T_1$ (so $L(T_1) = L(T_2)$).

$\mathcal{F}$ is used to range over sets of formulas in a language L = L(T). Unless
otherwise specified, each $\mathcal{F}$ is assumed to contain all basic formulas
(atomic formulas and their negations) and to be closed under $\wedge$ and $\vee$ (at
least up to equivalence). The simplest example of such $\mathcal{F}$ that we shall
consider is QF, the set of all quantifier-free formulas in L.

$T_2$ is said to be a conservative extension of $T_1$ for $\mathcal{F}$ if $T_2 \supseteq T_1$ and
whenever $\varphi \in \mathcal{F}$ and $T_2 \vdash \varphi$, then $T_1 \vdash \varphi$. $T_2$ is said to be a conservative
extension of $T_1$ if this holds for the set $\mathcal{F}$ of all formulas in $L(T_1)$.


4.1.7. Schemata 

Let $\sigma$ be some type in L and $P(a)$ a predicate of variables $a$ of type $\sigma$,
which is not a predicate symbol of L. We write L[P] for the language
extended by $P$, and $\psi_S[P]$ for some fixed sentence of L[P]. Given any
formula $\varphi(a)$ of L, we take $\psi_S[\varphi/P]$ to be the result of substituting $\varphi(t)$
for $P(t)$ at each occurrence of the form $P(t)$ in $\psi_S[P]$. $\varphi$ may contain free
variables other than $a$; these are called parameters of the substitution. If
there are no such variables, we say that $\psi_S[\varphi/P]$ is without parameters.

More generally, given any list $\boldsymbol{\sigma} = (\sigma_1, \ldots, \sigma_n)$ of types in L and
predicate $P(\mathbf{a})$ of variables $\mathbf{a} = (a_1^{\sigma_1}, \ldots, a_n^{\sigma_n})$ we may define $\psi_S[\varphi/P]$ for $\psi_S$
in L[P] and $\varphi(\mathbf{a})$ in L. By a scheme (S) in L we mean the set of all formulas
(called the instances of the scheme) of the form $\psi_S[\varphi/P]$ for $\varphi$ in L,
associated with some fixed $\psi_S[P]$. By (S) we mean the scheme using only
instances without parameters. On the other hand, by ($\mathcal{F}$-S) we mean the
set of all instances $\psi_S[\varphi/P]$ for $\varphi \in \mathcal{F}$.

Still more generally, schemes may be determined as instances of given
$\psi_S[P_1, \ldots, P_n]$. In practice we shall describe schemes in terms of their
instances.

For illustration, the scheme of induction (Ind$^N$) in L is determined by the
formula $P(0) \wedge \forall n\,(P(n) \to P(n')) \to \forall n\,P(n)$. Then (QF-Ind$^N$) denotes
the set of all instances $\varphi(0) \wedge \forall n\,(\varphi(n) \to \varphi(n')) \to \forall n\,\varphi(n)$ for $\varphi$
quantifier-free.

When dealing with $\boldsymbol{\sigma}$ of length $n = 1$ we can equivalently describe
schemes in terms of formulas $\psi_S(X)$ where $X$ is of type $(\sigma \to 0)$; then
$\psi_S[\varphi/X]$ is the result of substituting $\varphi(t)$ for $(t \in X)$ at all such occurrences.
In this sense (Ind$^N$) consists of all instances of I$^N$.


4.2. Comprehension and choice schemata

4.2.1. By the Axiom of Choice scheme we mean all formulas of the form
(AC) $\forall a\,\exists b\,\varphi(a, b) \to \exists f\,\forall a\,\varphi(a, fa)$.


(AC$_{\sigma,\tau}$) denotes the restriction of this scheme to variables $a^\sigma$, $b^\tau$. (AC$_{\sigma,-}$)
denotes $\bigcup_\tau$ (AC$_{\sigma,\tau}$).


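4.2.2. The (general) comprehension axiom scheme for sets is given by:
(CA) $\exists X\,\forall a\,[a \in X \leftrightarrow \varphi(a)]$.
When the matrix $\varphi$ of (AC) satisfies the unicity condition

$\varphi(a, b) \wedge \varphi(a, c) \to b = c$, (1)
we have a form of comprehension axiom for functions. In particular, for any
$\mathcal{F}$ and $\psi(a)$, $\theta(a)$ in $\mathcal{F}$ take


$\varphi(a, n) \leftrightarrow [\psi(a) \wedge n = 0] \vee [\theta(a) \wedge n = 1]$.
Then

$\forall a\,[\psi(a) \leftrightarrow \neg\theta(a)] \to \forall a\,\exists! n\,\varphi(a, n)$
and the following special comprehension axiom scheme for sets is a
consequence of ($\mathcal{F}$-AC):


($\Delta_{\mathcal{F}}$-CA) $\forall a\,[\psi(a) \leftrightarrow \neg\theta(a)] \to \exists X\,\forall a\,[a \in X \leftrightarrow \psi(a)]$


for each $\psi$, $\theta$ in $\mathcal{F}$.

The class $\mathcal{F}$ = QF of quantifier-free formulas provides the simplest cases
of interest of the relativized schemes in the following. Note that if $\neg\mathcal{F}$
denotes the set of $\varphi$ equivalent to $\neg\psi$ where $\psi \in \mathcal{F}$, then ($\mathcal{F}$-CA) and
($\neg\mathcal{F}$-CA) are equivalent. Some evident second-order cases to consider are
obtained using the classes $\Pi^0_\infty$, $\Sigma^1_1$, $\Pi^1_1$, etc., to be described in 5.1.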


4.3. Defining axioms for finite type constants 


The axioms for $K$, $S$ and $R$ are given in each appropriate combination
of types as follows:


(K) $Kab = a$
(S) $Sfga = fa(ga)$
(R) $Rfa0 = a$, $Rfan' = fn(Rfan)$.


Recall that equality at higher types is considered to be defined according to
4.1.4. As noted in 2.3.1, all primitive recursive functions of natural numbers
can be defined using (K), (S), and (R), but a recursive function which is
not primitive recursive can also be obtained.


Note. The use of the $\lambda$-notation for functions is justified in any theory
containing the axioms (K) and (S).


Another particular consequence of (K), (S), (R) is definition by cases:


$Dabn = \begin{cases} a & \text{if } n = 0, \\ b & \text{if } n \ne 0. \end{cases}$ (1)


On the other hand, one can prove the existence of these various functionals
from suitable simple instances of (AC): namely, $K$, $S$, $D$ from (QF-AC)
and the $R$-functionals from ($\mathcal{F}$-AC) where $\mathcal{F}$ consists of formulas in
(finite type) existential form.
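
The defining equations for the constants can be tried out directly on curried functions. The Python sketch below is only an illustration: it models the functionals by closures, computes the recursor by iteration, and defines the cases functional by one possible term built from the recursor, which need not be the term the text has in mind.

```python
def K(a):
    # K a b = a
    return lambda b: a

def S(f):
    # S f g a = f a (g a)
    return lambda g: lambda a: f(a)(g(a))

def R(f):
    # R f a 0 = a,  R f a n' = f n (R f a n)
    def with_start(a):
        def at(n):
            value = a
            for i in range(n):
                value = f(i)(value)
            return value
        return at
    return with_start

def D(a):
    # Definition by cases via the recursor: D a b 0 = a, D a b n' = b.
    return lambda b: R(K(K(b)))(a)

def add(m):
    # Primitive recursion: add m 0 = m, add m n' = (add m n)'.
    return R(K(lambda x: x + 1))(m)
```

For example, `S(K)(K)` behaves as the identity, which is the usual first step in justifying abstraction from (K) and (S), and `add(3)(4)` evaluates to 7.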


4.4. Number theory in finite types 


4.4.1. The language considered here uses all finite types generated from
type 0, and all constants $K$, $S$, and $R$ besides 0 and Sc. By the theory of
numbers in finite type, denoted $Z^\omega$, we mean the system with the following
axioms:


$Z^\omega$ (i) the axioms for 0, $'$,
(ii) the full induction scheme for $N$,
(iii) the axioms (K), (S),
(iv) the axioms (R).


We shall consider extensions of $Z^\omega$ by various axioms and axiom schemata,
e.g., $Z^\omega$ + (Ext), $Z^\omega$ + (QF-AC). In 4.6 we shall also consider finite type
subsystems of $Z^\omega$ with restricted induction and/or elementary recursion
operators.


4.4.2. Elimination of extensionality 

We shall illustrate this for $Z^\omega$ + (Ext). The same argument, which is due
to Gandy (cf. LUCKHARDT [1973]), works for many other theories. Define
$E_\tau(a)$ and $a \equiv_\tau b$ ($a$, $b$ variables of type $\tau$) for all $\tau$ inductively as follows:


(i) $E_0(n) \leftrightarrow (n = n)$, $(n \equiv_0 m) \leftrightarrow (n = m)$.
(ii) $E_{\sigma \to \tau}(f) \leftrightarrow \forall a\,(E_\sigma(a) \to E_\tau(fa)) \wedge \forall a, b\,[a \equiv_\sigma b \to fa \equiv_\tau fb]$,
$f \equiv_{\sigma \to \tau} g \leftrightarrow \forall a\,(E_\sigma(a) \to fa \equiv_\tau ga)$.

We now translate formulas $\varphi$ of the language L of $Z^\omega$ into new formulas
$\varphi^{(E)}$ of the same language, by replacing each quantifier $\forall a^\tau\,(\cdots)$ in $\varphi$ by
$\forall a^\tau\,[E_\tau(a) \to \cdots]$ and similarly for $\exists a^\tau\,(\cdots)$. Note that $\forall f^1\,E_1(f)$ holds
and more generally $\forall f^\tau\,E_\tau(f)$ holds for any $\tau$ of level 1. Thus if $\varphi$ is a
formula with all bound variables of level $\le 1$, $\varphi^{(E)}$ is equivalent to $\varphi$.


THEOREM. If $Z^\omega$ + (Ext) $\vdash \varphi$ where $\varphi$ is closed, then $Z^\omega \vdash \varphi^{(E)}$. Hence
$Z^\omega$ + (Ext) is conservative over $Z^\omega$ with respect to second-order formulas.


PROOF. It is easily shown in $Z^\omega$ that each constant $K$, $S$, and $R$ of type $\tau$
satisfies $E_\tau$. Thus the inner model provided by $(E_\tau)_\tau$ includes all the
constants; the axioms of $Z^\omega$ clearly relativize. To prove $\varphi^{(E)}$, where $\varphi$ is
any closure of an instance of (Ext) (4.1.4), note that $E_\sigma(a) \wedge E_\sigma(b)
\to [(a =_\sigma b)^{(E)} \leftrightarrow a \equiv_\sigma b]$. □


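4.4.3. Countable and dependent choice schemes over $Z^\omega$
The countable axiom of choice scheme in type $\tau$ consists of the formulas


(AC$_{0,\tau}$) $\forall n\,\exists b\,\varphi(n, b) \to \exists f\,\forall n\,\varphi(n, fn)$.
Related to this is the scheme for dependent choices
(DC$_\tau$) $\forall a\,\exists b\,\varphi(a, b) \to \forall a\,\exists f\,[f0 = a \wedge \forall n\,\varphi(fn, fn')]$


with $a$, $b$ of type $\tau$. A neat generalization of the two is:


(GDC$_\tau$) $\forall n\,\forall a\,\exists b\,\varphi(n, a, b) \to \forall a\,\exists f\,[f0 = a \wedge \forall n\,\varphi(n, fn, fn')]$.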
Lemma. (#— GDC,) follows from (# — AC,,,) in Z°. 


Proor. (¥ — AC,,,) implies Vn Va 3b ¢(n, a, 6) Sh Vn Va d(n, a, hna). 
Use (R) to obtain f with f0 =a, fn’'=hn(fn). O 
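
The construction in this proof is just an iteration of a given choice function. The following Python sketch shows that step for a computable choice function on numbers; the helper name `dependent_sequence` and the sample relation (the next value is the least multiple of n + 1 above the current one) are assumptions chosen for the example.

```python
def dependent_sequence(h, a, length):
    """Return [f(0), ..., f(length - 1)] where f(0) = a and
    f(n + 1) = h(n)(f(n)), i.e. the sequence built from a choice function h."""
    values = []
    current = a
    for n in range(length):
        values.append(current)
        current = h(n)(current)
    return values

def next_multiple(n):
    # Sample choice function: picks the least b > a divisible by n + 1.
    return lambda a: (a // (n + 1) + 1) * (n + 1)
```

Starting from 5, `dependent_sequence(next_multiple, 5, 5)` yields a sequence in which each term is related to its successor in the required way at every stage.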


4.5. Systems with various quantification operators 


These systems are suggested by the closure conditions considered in 
Section 2.3. 


4.5.1. The numerical quantification operator 
The language of $Z^\omega$ is enlarged by one new constant symbol $\exists^N$ of type 2,
with the axioms

($\exists^N$) (i) $\exists^N X = 0 \vee \exists^N X = 1$,
(ii) $\exists^N X = 0 \leftrightarrow \exists n\,(n \in X)$,


where $X$ is of type 1. The resulting system is denoted $Z^\omega$ + ($\exists^N$).


4.5.2. Type 1 quantification operators 
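To quantify over type 1, i.e., $(0 \to 0)$, which intuitively denotes $(N \to N)$,
we use a type 3 constant $\exists^{N \to N}$ with the axioms


($\exists^{N \to N}$) (i) $\exists^{N \to N} X = 0 \vee \exists^{N \to N} X = 1$,
(ii) $\exists^{N \to N} X = 0 \leftrightarrow \exists f\,(f \in X)$


where $X$ is of type 2 and $f$ of type 1.

Corresponding to the test for descending sequences discussed in 2.5 we
have an operator $\exists{\downarrow}^N$ which quantifies over type 1 but in a more special
way. Namely, $\exists{\downarrow}^N$ is a constant of type 2 with the axioms


($\exists{\downarrow}^N$) (i) $\exists{\downarrow}^N X = 0 \vee \exists{\downarrow}^N X = 1$,
(ii) $\exists{\downarrow}^N X = 0 \leftrightarrow \exists f\,\forall n\,(\langle fn', fn \rangle \in X)$,


where $X$ is of type 1, and $\langle m, p \rangle$ is a primitive recursive pairing function.
The resulting systems are denoted $Z^\omega$ + ($\exists^{N \to N}$) and $Z^\omega$ + ($\exists{\downarrow}^N$), respectively.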


4.6. Systems with restricted induction and/or recursion 


4.6.1. Restricted induction 

For any theory T considered above or in the following, restricted-T
denotes the same theory with the induction scheme (Ind$^N$) replaced by the
induction axiom (I$^N$) of 4.1.5. When T contains full (CA), then T =
restricted-T. In general, though, restricted-T is much weaker than T; a
number of examples illustrating this will appear at various points below.


4.6.2. Elementary recursion 
The elementary recursion operators in the sense of 2.3 (introduced by
Kleene) are given by constants $\hat{R}$ with the axioms


($\hat{R}$) (i) $\hat{R}fa0\mathbf{b} = a\mathbf{b}$,
(ii) $\hat{R}fan'\mathbf{b} = fn(\hat{R}fan\mathbf{b})\mathbf{b}$


in all appropriate combinations of types for which $a\mathbf{b}$ is of type 0. Thus =
is $=_0$ in ($\hat{R}$). (If the list $\mathbf{b} = b_1^{\sigma_1}, \ldots, b_k^{\sigma_k}$ is empty, then $a$ must be of type 0;
otherwise $a$ is of type $(\sigma_1, \ldots, \sigma_k \to 0)$.) We denote by $\hat{Z}^\omega$ the system $Z^\omega$
with the axioms (R) replaced by the ($\hat{R}$). As we have observed in 2.3, the
primitive recursive functions are exactly those defined by closed terms of
level 1 in $\hat{Z}^\omega$.

We can of course go on to consider the subsystem restricted-$\hat{Z}^\omega$. When
formalizing the part of mathematics which generalizes to all $N$- and
$\exists^N$-closed structures as indicated in 3.2.7, it may be seen that induction is
applied only to arithmetical properties, i.e., those in which all quantifiers range
over $N$. In other words, it is formalized in restricted-$\hat{Z}^\omega$ + ($\exists^N$). Naturally,
the verification of this requires detailed consideration of the mathematical
arguments utilized.


Norte. The proof that (¥ — AC,,,) implies (¥ — DC,) in Z”, given in 4.4.3, 
works only for 7 =0 in Z”, and in that case (¥ — Ind’) suffices for the 
argument. 


5. Some second-order comprehension and choice schemes 


5.1. Special classes of second-order formulas 


Consider the language L with just the types 0, 1. Take all primitive
recursive functions as level 1 operators of L (or any subset from which
these can be defined arithmetically, e.g., $+$, $\cdot$). A formula is said to be
arithmetical if it contains no bound variables of type 1, though it may
contain free variables of that type. The class of all such is denoted $\Pi^0_\infty$. The
subclass of $\Pi^0_1$-formulas consists of all those in the form $\forall n\,\psi$ where $\psi$ is
quantifier-free. Proceeding as usual in the classification of analytic predicates,
$\varphi$ is said to be in $\Pi^1_1$-form if $\varphi = \forall f\,\psi$ where $\psi$ is arithmetical, and in
$\Sigma^1_1$-form if $\varphi = \exists f\,\psi$ with $\psi$ arithmetical. Similarly, the $\Pi^1_2$-formulas
($\Sigma^1_2$-formulas) are those of the form $\forall f\,\exists g\,\psi$ ($\exists f\,\forall g\,\psi$) with $\psi$ arithmetical. We
also use the designations $\Pi^0_1$, $\Pi^0_\infty$, $\Pi^1_1$, $\Sigma^1_1$, etc. for the corresponding classes
of formulas.

By ($\Delta^1_1$-CA) we mean ($\Delta_{\mathcal{F}}$-CA) for $\mathcal{F} = \Sigma^1_1$; similarly for ($\Delta^1_2$-CA)
and $\mathcal{F} = \Sigma^1_2$. On the other hand, the classes $\Delta^1_1$, $\Delta^1_2$, etc. make sense only
relative to a given theory T: $\varphi$ is in $\Delta^1_1$ (relative to T) if there are $\psi \in \Sigma^1_1$,
$\theta \in \Pi^1_1$ such that T proves $(\varphi \leftrightarrow \psi) \wedge (\psi \leftrightarrow \theta)$.


5.2. Second-order subtheories of $Z^\omega$


These are formulated in the language using only types 0, 1 including
constants for all primitive recursive functions; we have seen that this may
be considered part of $Z^\omega$. By second-order arithmetic we mean the system
Z’ in this language with (i) the axioms for 0, $'$, and for each primitive
recursive function, and (ii) the full induction scheme for $N$. Again restricted-Z’
takes induction only in the restricted form. Every second-order system
considered below includes Z’ or restricted-Z’. Z denotes the usual system of
first-order number theory.


5.3. Second-order comprehension and choice schemes 


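Let $\langle\ ,\ \rangle$ be a primitive recursive pairing operation from $N \times N$ onto $N$;
for any $f$, $n$ let $(f)_n m = f\langle n, m \rangle$, i.e., $(f)_n = \lambda m \cdot f\langle n, m \rangle$. This allows us to
formulate second-order forms of AC$_{0,1}$, DC$_1$, and GDC$_1$ as follows:


(AC) $\forall n\,\exists g\,\varphi(n, g) \to \exists f\,\forall n\,\varphi(n, (f)_n)$,
(DC) $\forall g\,\exists h\,\varphi(g, h) \to \forall g\,\exists f\,[(f)_0 = g \wedge \forall n\,\varphi((f)_n, (f)_{n'})]$,
(GDC) $\forall n\,\forall g\,\exists h\,\varphi(n, g, h) \to \forall g\,\exists f\,[(f)_0 = g \wedge \forall n\,\varphi(n, (f)_n, (f)_{n'})]$.


(Note that we are dropping type subscripts in the designations of these
schemes.)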
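
The coding of a sequence of functions by a single function can be illustrated as follows. This Python sketch assumes the Cantor pairing as the pairing operation (the text only requires some primitive recursive pairing onto N), and the names `pair`, `unpair`, `section` and `join` are hypothetical.

```python
def pair(n, m):
    # Cantor pairing: a bijection from N x N onto N.
    return (n + m) * (n + m + 1) // 2 + m

def unpair(z):
    # Inverse of pair, found by locating the diagonal w = n + m.
    w = 0
    while (w + 1) * (w + 2) // 2 <= z:
        w += 1
    m = z - w * (w + 1) // 2
    return w - m, m

def section(f, n):
    # (f)_n = lambda m . f(<n, m>)
    return lambda m: f(pair(n, m))

def join(gs):
    # Code finitely many functions g_0, ..., g_{k-1} as one f with
    # (f)_n = g_n for n < k; f is 0 on pairs whose first component is >= k.
    def f(z):
        n, m = unpair(z)
        return gs[n](m) if n < len(gs) else 0
    return f
```

With `f = join([g0, g1])`, the call `section(f, 1)` recovers `g1`, which is the role played by the sections in the second-order schemes above.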

Each system obtained from Z’ by adjoining a scheme (S) will be named
simply by its additional scheme (S), e.g., ($\Sigma^1_1$-DC) denotes Z’ + ($\Sigma^1_1$-DC).
Similarly, restricted-($\Sigma^1_1$-DC) denotes restricted-Z’ + ($\Sigma^1_1$-DC).


5.4. Second-order subsystems of some finite-type theories 


5.4.1. Arithmetical comprehension in $Z^\omega$ + ($\exists^N$)

Note that this system expresses the condition of closure under $\exists^N$ for
finite type structures.

All arithmetical formulas are equivalent to formulas built up by $\vee$, $\neg$,
$\exists n$. This build-up can be reflected by terms, with primitive recursive
functions taking care of the atomic formulas and propositional connectives
and the constant $\exists^N$ taking care of numerical quantifiers. In other words,
we obtain:


LEMMA. For each arithmetical $\varphi$ there is a term $t$ with the same free variables
as $\varphi$ such that $(\varphi \leftrightarrow t = 0)$ is provable in restricted-$Z^\omega$ + ($\exists^N$).


COROLLARY. (i) $Z^\omega$ + ($\exists^N$) $\supseteq$ ($\Pi^0_\infty$-CA).
(ii) The same holds for the corresponding restricted systems.


5.4.2. Adding choice


THEOREM. (i) $Z^\omega$ + ($\exists^N$) + (QF-AC) $\supseteq$ ($\Pi^0_\infty$-AC).
(ii) $Z^\omega$ + ($\exists^N$) + (QF-AC) $\supseteq$ ($\Pi^0_\infty$-DC).
(iii) The same results hold for the corresponding restricted theories.


PROOF. (i) is immediate by the lemma of the preceding section. For (ii) we
also use (the proof of) the lemma of 4.4.3. □


5.4.3. Easy comparisons of these second-order systems 


THEOREM. (i) ($\Pi^0_1$-CA) = ($\Pi^0_\infty$-CA).
(ii) ($\Pi^0_1$-AC) = ($\Pi^0_\infty$-AC) = ($\Sigma^1_1$-AC).
(iii) ($\Pi^0_1$-DC) = ($\Pi^0_\infty$-DC) = ($\Sigma^1_1$-DC).
(iv) ($\Pi^0_\infty$-CA) $\subseteq$ ($\Delta^1_1$-CA) $\subseteq$ ($\Sigma^1_1$-AC) $\subseteq$ ($\Sigma^1_1$-DC).
(v) The same results hold for the corresponding restricted theories.


PROOF. We mention only three points which are involved. First, each
arithmetical formula $\varphi$ is shown (by induction) to define a set in ($\Pi^0_1$-CA);
the build-up of $\varphi$ is reflected by operations on sets $-X$, $X \cup Y$, $\exists^N X$.
Second, if $\psi = \exists h\,\varphi(g, h)$ where $\varphi$ is arithmetical, then $\exists g\,\psi$ is equivalent
to $\exists g\,\varphi((g)_0, (g)_1)$. Hence choice principles for $\Sigma^1_1$-matrices reduce to those
for $\Pi^0_\infty$- and thence $\Pi^0_1$-matrices. □


NOTE. (i) ($\Pi^0_\infty$-CA) has a finite axiomatization over Z’. The idea for this is
contained in the first part of the proof just indicated. It is similar to the
finite axiomatization of the Bernays-Gödel theory of sets and classes (with
type 0 corresponding to sets and type 1 to classes).

(ii) It is essential for the proof that the formulas involved may contain
second-order parameters. There is occasionally reason to consider the
various schemes without parameters, as will be seen in Section 7.

The following generalizes (iv) above: 


THEOREM. If # DIT, then (As — CA) C(F¥- AC) C(¥- DC). 


5.4.4. The effect of choice with ($\exists{\downarrow}^N$)


THEOREM. (i) $Z^\omega$ + ($\exists{\downarrow}^N$) + (QF-AC) proves ($\Sigma^1_2$-DC).
(ii) ($\Delta^1_2$-CA) = ($\Sigma^1_2$-AC) = ($\Sigma^1_2$-DC).


PROOF. (i) First note that ($\Sigma^1_2$-DC) = ($\Pi^1_1$-DC). The latter reduces to
(QF-AC) in $Z^\omega$ + ($\exists{\downarrow}^N$) by arguments previously used.

(ii) ($\Sigma^1_2$-DC) $\subseteq$ ($\Delta^1_2$-CA) is by formalization of the Kondô-Addison
Theorem, according to which any $\Pi^1_1$ relation between type 1 functions can
be uniformized. Standard proofs (see, e.g., ROGERS [1967]) make use of
abstract countable ordinals. However, these can be treated by well-orderings
in $N$ as soon as we have comparability of well-orderings. The
proof will be completed in Section 6.


NOTE. Addison and Kleene observed that the $\Delta^1_2$-predicates of natural
numbers are closed under ($\exists{\downarrow}^N$); a formal version of their argument is
immediate. $\Delta^1_2$ seems to be the syntactically simplest class with this
property.


5.5. A restricted system conservative over Z


5.5.1. The following shows the weakening which can take place when
induction for $N$ is restricted to the axiom (I$^N$). The result will be
strengthened in Section 8.7.


THEOREM. (i) Restricted ($\Pi^0_\infty$-CA) is a conservative extension of Z.
(ii) Restricted $\hat{Z}^\omega$ + ($\exists^N$) is a conservative extension of restricted
($\Pi^0_\infty$-CA).


PROOF.* For (i) show that any model $\mathfrak{M}$ of Z can be transformed into
a model $\mathfrak{M}'$ of restricted ($\Pi^0_\infty$-CA). Take as the functions of the new
model $\mathfrak{M}'$ all $f : M \to M$ definable in $\mathfrak{M}$ by an arithmetical formula $\varphi$
using parameters in $M$. Then the induction axiom $0 \in X \wedge
\forall x\,[x \in X \to x' \in X] \to \forall x\,[x \in X]$ reduces for each $X$ to an instance
of the elementary scheme of induction, assumed true in $\mathfrak{M}$.

For (ii) we proceed similarly, only now beginning with an $\mathfrak{M}'$ and
forming a model $\mathfrak{M}^*$ (not necessarily extensional) with domains $M_\tau$. These
are defined by induction on $\mathrm{lev}(\tau)$. Given $\tau = (\sigma_1, \ldots, \sigma_n \to 0)$ and arithmetical
$\varphi(a_1^{\sigma_1} \cdots a_n^{\sigma_n}, y)$ permitting parameters from $M_0$, $M_1$, or of level
$< \mathrm{lev}(\tau)$, such that $\forall a_1 \in M_{\sigma_1} \cdots \forall a_n \in M_{\sigma_n}\,\exists! y\,\varphi(\mathbf{a}, y)$, define $f_\varphi\mathbf{a} = y$ iff
$\varphi(\mathbf{a}, y)$; $M_\tau$ is taken to consist of all such $f_\varphi$. It may be shown that each
constant $K$, $S$, $\hat{R}$ has an interpretation in $\mathfrak{M}^*$ and that no new functions of
type 1 arise. □


* Note added in proof: The proof of part (ii) is not adequate as it stands, though the result is
correct by the (even stronger) second theorem of 8.7.


5.5.2. Restriction is indeed a weakening in these cases:


THEOREM. The consistency of Z is provable in ($\Pi^0_\infty$-CA).


PROOF. There is a formula $\psi(n, X)$ which expresses that $X$ is the set $\mathrm{Tr}_n$ of
true first-order sentences of logical complexity $< n$. Using full induction
for $N$ we prove $\forall n\,\exists X\,\psi(n, X)$. The step $\forall X\,\exists Y\,[\psi(n, X) \to \psi(n', Y)]$ is
elementary by the way in which $\mathrm{Tr}_{n+1}$ is arithmetically defined from $\mathrm{Tr}_n$.
Then we may apply induction again to prove that everything provable from
Z is true. □


6. Transfinite induction, recursion and iterated principles 


The notions here can be applied to relations between type $\tau$ objects for
any $\tau$. However, most of the work in our subject concerns type 0. For
example, a common program is to characterize the arithmetical part or the
1-section of a finite type theory in terms of schemes of induction and
recursion with respect to particular recursive well-orderings of natural
numbers.


6.1. Well-foundedness and induction 


6.1.1. Well-foundedness of orderings 

Every $X \subseteq N$ may be considered as a binary relation consisting of the
pairs $(k, m)$ with $\langle k, m \rangle \in X$; we write $k <_X m$ for $\langle k, m \rangle \in X$. Conversely
each binary relation $\prec$ determines the set $X$ of all $\langle k, m \rangle$ with $k \prec m$. We
shall be considering relations given by formulas $\psi$:


$(k \prec m) = \psi(k, m)$, (1)


where $\psi$ may contain parameters. Then the well-foundedness of $\prec$ is
expressed by the formula


WF($\prec$) $= \neg\exists f\,\forall n\,\psi(fn', fn)$. (2)
In particular, in $Z^\omega$ + ($\exists{\downarrow}^N$), WF($<_X$) is equivalent to $\exists{\downarrow}^N X = 1$. Clearly
WF($\prec$) implies that $\prec$ is irreflexive; we do not assume that $\prec$ is transitive
or connected. Finally, WF($\prec$) is also written WF$_\prec$ when $(k \prec m)$ is given
by a formula (1) without parameters.
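
For a relation with only finitely many pairs, well-foundedness in the sense of (2) can be checked mechanically: an infinite descending sequence exists exactly when the relation contains a cycle. The Python sketch below illustrates this finite case only; the function name `is_well_founded` is hypothetical, and for relations on all of N no such decision procedure is claimed.

```python
def is_well_founded(pairs):
    """pairs: a finite set of (k, m), read as 'k is below m'.
    Return True iff there is no infinite descending sequence."""
    below = {}
    for k, m in pairs:
        below.setdefault(m, set()).add(k)
    state = {}  # node -> "open" while being explored, "done" once cleared

    def descends_forever(m):
        if state.get(m) == "done":
            return False
        if state.get(m) == "open":
            return True  # returned to m: a cycle yields an infinite descent
        state[m] = "open"
        for k in below.get(m, ()):
            if descends_forever(k):
                return True
        state[m] = "done"
        return False

    return not any(descends_forever(m) for m in list(below))
```

In particular a relation containing a pair (k, k) is rejected, in line with the remark that well-foundedness implies irreflexivity, while transitivity and connectedness play no role.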


6.1.2. Induction on orderings 

As we know on general set-theoretical grounds, well-foundedness implies
principles of transfinite induction and recursion. The latter will be considered
in 6.2.

For any $\prec$ the scheme TI($\prec$) of transfinite induction with regard to $\prec$
(in any given language) consists of all formulas of the form


TI($\prec, \varphi$) $\forall m\,[\forall k \prec m\,\varphi(k) \to \varphi(m)] \to \forall m\,\varphi(m)$,


where $\forall k \prec m\,\varphi(k)$ abbreviates $\forall k\,(k \prec m \to \varphi(k))$. The hypothesis of
TI($\prec, \varphi$) is sometimes called the progressiveness of $\varphi$.

When $\prec$ is $<_X$ we also write TI($X, \varphi$) for TI($\prec, \varphi$). When $\prec$ is given
by a formula without parameters we write TI$_\prec$($\varphi$). TI($\prec, Y$) is the instance
of the scheme using the formula $\varphi(k) = (k \in Y)$; thus we may also write
TI($X, Y$) or TI$_\prec$($Y$) in case $\prec$ is $<_X$ or is defined without parameters.
Finally, as with all our schemes, we may indicate restriction to a class $\mathcal{F}$;
($\mathcal{F}$-TI($\prec$)) consists of all TI($\prec, \varphi$) with $\varphi$ in $\mathcal{F}$.


6.1.3. Induction on trees 

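Take a primitive recursive representation $\langle k_0, \ldots, k_{n-1}\rangle$ of finite
sequences by numbers; $s, t$ range over such finite sequence numbers. When
$s = \langle k_0, \ldots, k_{n-1}\rangle$ put $\mathrm{lh}(s) = n$, $s_i = k_i$, and $s \upharpoonright m = \langle k_0, \ldots, k_{m-1}\rangle$ for $m \le n$.
We write $t \subseteq s$ if $t = s \upharpoonright m$ for some $m \le \mathrm{lh}(s)$, and $t \subset s$ for $t \subseteq s \wedge t \ne s$.
Finally, $s * \langle l\rangle = \langle k_0, \ldots, k_{n-1}, l\rangle$.

We say that $X$ is a tree and write $\mathrm{Tree}(X)$ if $\forall s, t\,[s \in X \wedge t \subseteq s \to t \in X]$.
With each tree is associated the relation $s \prec_X t$ defined by $s \in X \wedge t \subset s$.
(This is not to be confused with $\prec_X$ defined as in 6.1.1; the cases are
distinguished by the styles of variables which are used.) For $f$ of type 1 let
$\bar f(n) = \langle f0, \ldots, f(n-1)\rangle$. Then under weak hypotheses $\mathrm{Tree}(X)$ implies


    $\mathrm{WF}(\prec_X) \leftrightarrow \neg\exists f\,\forall n\,[\bar f(n) \in X]$.        (1)

Under these hypotheses $\mathrm{TI}(\prec_X, \varphi)$ translates into


    $\forall s\,[s \notin X \to \varphi(s)] \wedge \forall s\,[\forall k\,\varphi(s * \langle k\rangle) \to \varphi(s)] \to \forall s\,\varphi(s)$.        (2)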
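
The sequence operations and the tree relation of this subsection can be tried out on small finite trees. The following Python sketch is an illustration only and rests on assumptions not in the text: tuples stand in for the sequence numbers (no primitive recursive coding is attempted), the tree X is a finite Python set, and all function names are hypothetical.

```python
# Illustrative sketch: Python tuples play the role of the finite sequence
# numbers <k0,...,k_{n-1}>; X is a finite set of such tuples.

def lh(s):
    """Length lh(s) of the sequence s."""
    return len(s)

def restrict(s, m):
    """Initial segment s|m = <k0,...,k_{m-1}>, defined for m <= lh(s)."""
    assert 0 <= m <= len(s)
    return s[:m]

def is_initial_segment(t, s):
    """t is an initial segment of s, i.e. t = s|m for some m <= lh(s)."""
    return len(t) <= len(s) and s[:len(t)] == t

def is_proper_initial_segment(t, s):
    """t is an initial segment of s and t != s."""
    return is_initial_segment(t, s) and t != s

def extend(s, l):
    """The sequence s * <l>."""
    return s + (l,)

def is_tree(X):
    """Tree(X): X is closed under initial segments."""
    return all(restrict(s, m) in X for s in X for m in range(len(s) + 1))

def tree_less(X, s, t):
    """The relation associated with X: s in X and t a proper initial segment of s."""
    return s in X and is_proper_initial_segment(t, s)

def rank(X, s):
    """Height of s in the finite tree X, computed by recursion on the tree relation."""
    children = [t for t in X if len(t) == len(s) + 1 and is_initial_segment(s, t)]
    return 0 if not children else 1 + max(rank(X, t) for t in children)
```

For a finite tree the associated relation is always well-founded, so `rank` terminates; the infinite case, where well-foundedness means the absence of an infinite branch, is of course not captured by such a finite check.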


6.1.4. Bar induction (well-foundedness implies induction) 
For any formula $\prec$ the scheme $\mathrm{BI}(\prec)$ of bar induction with regard to $\prec$
consists of all instances


    $\mathrm{BI}(\prec, \varphi)$    $\mathrm{WF}(\prec) \to \mathrm{TI}(\prec, \varphi)$.


The terminology was introduced by Brouwer for principles applied to
trees. Using (1) and (2) of 6.1.3, the scheme translates into a statement of
bar induction very close to that used in the intuitionistic literature (cf.
Chapter D.7, Section 10).

As with TI we may use the notations: $\mathrm{BI}(X, \varphi)$, $\mathrm{BI}(\prec, Y)$, $(\mathcal{F}\text{-BI}_\prec)$,
etc.


6.1.5. Some provable cases of bar induction


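THEOREM. (i) $(\Pi^0_\infty\text{-CA}) \vdash [\mathrm{WF}(X) \leftrightarrow \forall Y\,\mathrm{TI}(X, Y)]$.
(ii) If $\mathcal{F} \supseteq \Pi^0_\infty$, then $(\mathcal{F}\text{-CA}) \vdash \mathrm{BI}(\prec_X, \varphi)$ for each $\varphi$ in $\mathcal{F}$.


PROOF. (i) Suppose $\mathrm{WF}(X)$, i.e., there are no descending sequences in
$\prec_X$. Given $Y$ such that $\forall m\,[\forall k \prec_X m\,(k \in Y) \to m \in Y]$, we must show that
$\forall m\,(m \in Y)$ holds. If not, there exists $m_0 \notin Y$. Further, with each $m \notin Y$
we can associate $h(m) \prec_X m$ having $h(m) \notin Y$. By $(\Pi^0_\infty\text{-CA})$ there exists $f$
such that $f0 = m_0$ and $\forall n\,f(n') = h(fn)$, contradicting $\mathrm{WF}(X)$. Conversely,
given $\forall Y\,\mathrm{TI}(X, Y)$ and arbitrary $f$, in order to show $\neg\forall n\,f(n') \prec_X f(n)$ use
$Y$ defined by:


    $m \in Y \leftrightarrow \neg\exists n_0\,(f n_0 \preceq_X m \wedge \forall n \ge n_0\,f(n') \prec_X f(n))$.


(ii) is an immediate consequence of (i). □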
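
The first half of the proof of (i) can be watched at work on a finite relation that is deliberately not well-founded. The following Python sketch is an illustration under stated assumptions: the domain is finite, the relation is given as a Python predicate `prec(k, m)` meaning k ≺ m, and the names `is_progressive` and `descending_sequence` are hypothetical. As in the proof, h(m) is taken to be the least predecessor of m outside Y.

```python
def is_progressive(domain, prec, Y):
    """Hypothesis of TI(X, Y): whenever all predecessors of m lie in Y, so does m."""
    return all(
        m in Y
        for m in domain
        if all(k in Y for k in domain if prec(k, m))
    )

def descending_sequence(domain, prec, Y, m0, steps):
    """Starting from m0 outside a progressive Y, iterate h(m) = least k < m with k not in Y.

    Progressiveness guarantees that such a k exists whenever m is outside Y,
    so the returned list is a descending sequence of length steps + 1.
    """
    seq = [m0]
    for _ in range(steps):
        m = seq[-1]
        seq.append(min(k for k in domain if prec(k, m) and k not in Y))
    return seq

# Example: 0 < 1 < 2 is a well-founded chain, while 3 and 4 precede each other.
edges = {(0, 1), (1, 2), (3, 4), (4, 3)}
prec = lambda k, m: (k, m) in edges
example = descending_sequence(range(5), prec, {0, 1, 2}, 3, 4)
```

Here Y = {0, 1, 2} is progressive but omits 3 and 4, and the sequence produced keeps descending through the cycle, which is exactly the situation excluded by WF(X).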


Of course under $(\mathcal{F}\text{-CA})$ we can also write (ii) with $\mathrm{BI}(\prec, \varphi)$ for any
relation $\prec$ in $\mathcal{F}$.

Though $(\mathcal{F}\text{-CA})$ is formally required to establish $(\mathcal{F}\text{-BI})$, the latter
principle is generally perceived as being more elementary or basic than the
former (particularly for familiar $\prec$).


6.1.6. Relationship with $\Pi^1_1$-comprehension

By recursion theory, every $\Pi^1_1$ predicate $\varphi(m, g)$ of numbers and
functions can be brought to a normal form $\forall f\,\exists n\,P(m, \bar f(n), \bar g(n))$ where $P$
is primitive recursive⁷; this argument uses only $(\Pi^0_\infty\text{-CA})$. The normal
form may be read as expressing well-foundedness of the tree of $s$ for which
$\forall k \le \mathrm{lh}(s)\,\neg P(m, s \upharpoonright k, \bar g(k))$. Hence we have the following:


Lemna. (Il}-CA)C Z° +(4 J). 


Our interest now is to compare the second-order systems $(\Pi^0_\infty\text{-CA}) + (\mathrm{BI})$
and $(\Pi^1_1\text{-CA})$. This will be a comparison as to strength since, as will
be seen, neither is contained in the other. By 6.1.5 we know only that
$(\Pi^1_1\text{-BI})$ and $(\Sigma^1_1\text{-BI})$ are contained in $(\Pi^1_1\text{-CA})$. The following was
observed by Kreisel⁸:


⁷ For all such recursion-theoretic results the reader is referred to ROGERS [1967].

⁸ This, and various of the results below credited to Kreisel without reference, were first
presented in the 1963 "Reports of a seminar on the foundations of analysis at Stanford",
which was circulated but not published. A number of these are mentioned in KREISEL [1968].


THEOREM. $(\Pi^1_1\text{-CA})$ proves the existence of an $\omega$-model for $(\Pi^0_\infty\text{-CA}) + (\mathrm{BI}_\prec)$
($\prec$ primitive recursive).


PROOF. The idea is to formalize the recursion-theoretic $\omega$-model defined
as the (enumerated) collection $M_1$ of all functions arithmetic in the set $O$ of
constructive ordinal notations. By the Kleene basis theorem (leftmost
branch selection), for $P$ primitive recursive


    $(\exists f \in M_1)\,\forall n\,[\bar f(n) \in P] \leftrightarrow \exists f\,\forall n\,[\bar f(n) \in P]$.


The existence of $M_1$ is provable in $(\Pi^1_1\text{-CA})$, as well as the definition of
satisfaction for second-order formulas in $M_1$. Then for any second-order $\varphi$
with parameters in $M_1$, the set defined by $\varphi$ relative to $M_1$ is proved to exist
in $(\Pi^1_1\text{-CA})$, so $\mathrm{BI}(P, \varphi)$ holds by the result of the preceding section. □


There is a recursion-theoretic $\omega$-model $M_2$ of $(\Pi^0_\infty\text{-CA}) + (\mathrm{BI})$ using the
same idea, relativized and iterated. Namely, take $M_2$ to consist of all $f$
recursive in some $O_n$, where $O_0 = N$ and $O_{n+1} = O^{O_n}$. Though a statement
which expresses that $O_n$ exists (for all $n$) can be proved in $(\Pi^1_1\text{-CA})$ and
hence $M_2$ can be defined there, the truth definition for $M_2$ is not definable
there. This gives interest to the following result of FRIEDMAN [1969], to
which we refer the reader for a proof.


THEOREM. $(\Pi^1_1\text{-CA})$ proves the existence of an $\omega$-model for $(\Pi^0_\infty\text{-CA}) + (\mathrm{BI})$;
it is conservative for $\Pi^1_1$ sentences.


6.2. Transfinite recursion 

For a tree $X \subseteq N$, one form of definition by transfinite recursion on $X$ is:
$fs = gs$ for $s \notin X$, $fs = hs(\lambda n.\,f(s * \langle n\rangle))$ for $s \in X$, where $g, h$ are given
functions. However, $h$ must be of type 2 in this definition. Because of this,
transfinite recursion is more naturally dealt with in a finite type theory. For
simplicity we shall only formulate here some second-order forms of
recursion. Given a tree $X$, define $(f \upharpoonright_X s)$ of type 1 by:


    $(f \upharpoonright_X s)\,t = \begin{cases} ft & \text{if } s \subset t \text{ and } t \in X, \\ 0 & \text{otherwise.} \end{cases}$        (1)


For a relation $\prec_X$ define


    $(f \upharpoonright_{\prec_X} n)\,m = \begin{cases} fm & \text{if } m \prec_X n, \\ 0 & \text{otherwise.} \end{cases}$        (2)


We write $\mathrm{TR}(X, \varphi, f)$ for $\forall s\,\varphi(s, f \upharpoonright_X s, fs)$ when $\mathrm{Tree}(X)$, and $\mathrm{TR}(\prec_X, \varphi, f)$
for $\forall n\,\varphi(n, f \upharpoonright_{\prec_X} n, fn)$ in connection with relations. This is appropriate
notation when $\varphi(n, g, m)$ determines a functional $m = hng$, i.e., when
$\forall n, g\;\exists! m\,\varphi(n, g, m)$.


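THEOREM. In $(\Delta^1_1\text{-CA})$, $\Sigma^1_1\text{-TI}(X)$ implies $\forall n\,\forall g\,\exists! m\,\varphi(n, g, m) \to \exists! f\,\mathrm{TR}(X, \varphi, f)$
when $\varphi$ is $\Sigma^1_1$.


PROOF. By inspection of the standard argument for deriving transfinite
recursion from transfinite induction. □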
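
The shape of the recursion in (2) can be illustrated on a finite domain, where the standard argument reduces to evaluating predecessors first. The following Python sketch is an illustration under stated assumptions, not a rendering of the formal proof: the domain is finite, the relation is a Python predicate `prec(m, n)` meaning m ≺ n, the functional is given as a function `h(n, g)` of a number and a restricted function g (a dict), and all names are hypothetical.

```python
def restrict_below(f, prec, n, domain):
    """The restriction of f below n: equal to f at m with m < n, and 0 elsewhere."""
    return {m: (f[m] if prec(m, n) else 0) for m in domain}

def transfinite_recursion(domain, prec, h):
    """Return the unique f on domain with f[n] = h(n, restriction of f below n).

    Assumes prec is well-founded on the finite domain; a cycle is reported
    as an error instead of looping forever.
    """
    f = {}

    def value(n, visiting):
        if n in f:
            return f[n]
        if n in visiting:
            raise ValueError("relation is not well-founded")
        for m in domain:
            if prec(m, n):
                value(m, visiting + (n,))
        f[n] = h(n, restrict_below(f, prec, n, domain))
        return f[n]

    for n in domain:
        value(n, ())
    return f

# Example: along the usual order on {0,...,5}, f(n) = 1 + sum of earlier values.
powers = transfinite_recursion(range(6), lambda m, n: m < n,
                               lambda n, g: 1 + sum(g.values()))
```

Uniqueness of f corresponds to the fact that each value is forced once the values at all predecessors are known; on an infinite well-founded relation this is where transfinite induction is needed.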


Let CWO be the sentence expressing that if $\prec_X$ and $\prec_Y$ are any two
well-ordering relations, then $\prec_X$ is isomorphic to an initial segment of $\prec_Y$
or vice versa.


COROLLARY. $(\Delta^1_1\text{-CA}) + (\Sigma^1_1\text{-BI}) \vdash \mathrm{CWO}$.


As we have noted in 3.3, CWO is much used in the recursion-theoretic
treatment of ordinals and in particular in the theory of $O$ and $\Pi^1_1$ and
hyperarithmetic predicates. For application of the theorem on recursion to
hierarchies cf. 6.4.

We can also now complete the proof of the theorem in 5.4.5. In
$(\Delta^1_2\text{-CA})$ we can derive $(\Delta^1_1\text{-CA})$ and $(\Sigma^1_1\text{-BI})$, hence CWO. This is all
we need to make use of well-ordering relations in place of ordinals.


6.3. Provably recursive well-orderings 


6.3.1. Ordinal functions and associated well-orderings 

For various specific ordinals $\alpha$ we have natural primitive recursive
well-orderings $\prec_\alpha$ of order type $\alpha$. These are given by orderings between
terms built up from symbols for given ordinal functions. The most familiar,
using $+$, $\cdot$, exp, and Cantor normal form, represents the segment up to
$\alpha = \varepsilon_0 = \lim_n \omega_n$ where $\omega_0 = 1$, $\omega_{n+1} = \omega^{\omega_n}$. This is extended using successive
critical functions $\kappa^{(\nu)}$ defined by

(i) $\kappa^{(0)}(\alpha) = \omega^\alpha$,

(ii) $\kappa^{(\nu)}$ enumerates $\{\alpha \mid \forall \mu < \nu\;\kappa^{(\mu)}(\alpha) = \alpha\}$ for $\nu > 0$.

Thus $\varepsilon_0 = \kappa^{(1)}(0)$. Let $\Gamma_0$ be the least ordinal such that $\nu, \alpha < \Gamma_0$ implies
$\kappa^{(\nu)}(\alpha) < \Gamma_0$; it may alternatively be described as the least solution of
$\kappa^{(\nu)}(0) = \nu$. A natural primitive recursive well-ordering $\prec_{\Gamma_0}$ is obtained by
using terms for the functions $+$, $\cdot$, exp and $\lambda\nu, \alpha.\,\kappa^{(\nu)}(\alpha)$ (cf. SCHÜTTE
[1960], FEFERMAN [1968]). For the present purposes when considering $\prec_\alpha$ it
is sufficient to take $\alpha = \Gamma_0$, and $\prec_a$ is the ordering of the initial segment of
length $|a| = \alpha$. We write $a \oplus b$ and $\omega^a$ for primitive recursive functions on
$\prec_{\Gamma_0}$ which make $|a \oplus b| = |a| + |b|$ and $|\omega^a| = \omega^{|a|}$. $\mathrm{TI}(a, \varphi)$ is $\mathrm{TI}(\prec_a, \varphi)$
and $\mathrm{TI}(a)$ is the scheme of all such in any given language. Finally, $I(a)$ is
written for $\forall Y\,\mathrm{TI}(\prec_a, Y)$; this is equivalent to $\mathrm{WF}(\prec_a)$ in $(\Pi^0_\infty\text{-CA})$ by
6.1.5. Throughout, when $a$ is specified and $\alpha = |a|$ we also write $\alpha$ for $a$,
e.g., $\mathrm{TI}(\alpha)$.
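
For the segment below $\varepsilon_0$, the term ordering based on Cantor normal form is simple enough to write out. The following Python sketch is an illustration under stated assumptions and does not reproduce the particular primitive recursive coding of the cited sources: an ordinal below $\varepsilon_0$ is represented by the tuple of exponents of its Cantor normal form in non-increasing order (so the empty tuple is 0), and the names `compare`, `add`, `omega_pow` and `tower` are hypothetical.

```python
ZERO = ()  # the empty sum represents the ordinal 0

def omega_pow(a):
    """Notation for omega^a: a sum with the single exponent a."""
    return (a,)

def compare(a, b):
    """Return -1, 0 or 1 according as |a| < |b|, |a| = |b| or |a| > |b|.

    Exponent tuples are compared lexicographically, exponents being
    compared recursively; a proper prefix is the smaller ordinal.
    """
    for x, y in zip(a, b):
        c = compare(x, y)
        if c != 0:
            return c
    return (len(a) > len(b)) - (len(a) < len(b))

def add(a, b):
    """Notation for |a| + |b|: terms of a below the leading term of b are absorbed."""
    if not b:
        return a
    lead = b[0]
    return tuple(x for x in a if compare(x, lead) >= 0) + b

def tower(n):
    """Notation for omega_n, where omega_0 = 1 and omega_{n+1} = omega^(omega_n)."""
    t = omega_pow(ZERO)
    for _ in range(n):
        t = omega_pow(t)
    return t
```

In this representation `add` and `omega_pow` play the roles of $a \oplus b$ and $\omega^a$, and the notations `tower(n)` are cofinal in the ordering, which therefore has order type $\varepsilon_0$.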


6.3.2. Induction below $\varepsilon_0$

Gentzen proved that $Z \vdash \mathrm{TI}(\alpha)$ for each $\alpha < \varepsilon_0$ and that this is best
possible by showing $\mathrm{TI}(\varepsilon_0, \varphi) \to \mathrm{Con}(Z)$ for a certain (quantifier-free) $\varphi$.
SCHÜTTE [1960] has given a simple proof of the first fact. With $\varphi(n)$ in any
language we may associate $\varphi^*(n)$ which is built up by numerical
quantification from $\varphi$ as follows:


    $\varphi^*(a) \leftrightarrow \forall b\,[\forall c \prec b\,\varphi(c) \to \forall c \prec b \oplus \omega^a\,\varphi(c)]$.        (1)


LEMMA. $\mathrm{TI}(a, \varphi^*) \to \mathrm{TI}(\omega^a, \varphi)$ is provable in any extension of Z.


The proof uses that if $\varphi$ satisfies the progressiveness hypothesis of TI, so
does $\varphi^*$.


COROLLARY. If T is any extension of Z and $\alpha < \varepsilon_0$, then $T \vdash \mathrm{TI}(\alpha)$.


The reason mentioned above for the bound $\varepsilon_0$ on Z will be discussed
further in 8.2; in fact, it is the bound for any provable arithmetical
ordering.


6.3.3. Strengthening a theory does not necessarily increase its stock of
provably recursive well-orderings; e.g., this is the case with $Z + \mathrm{Con}(Z)$.
However, adding stronger comprehension principles usually has that effect.
The following shows the matter to be delicate. First, in $(\Pi^0_\infty\text{-CA})$ the
previous lemma allows us to prove $\forall Y_1\,\exists Y_2\,[\mathrm{TI}(a, Y_2) \to \mathrm{TI}(\omega^a, Y_1)]$,
hence $I(a) \to I(\omega^a)$. By induction we then obtain $I(a) \to I(e(a))$ where
$e(a)$ represents the next $\varepsilon$-number beyond $a$. Hence we also have
$I(\varepsilon_b) \to I(\varepsilon_{b+1})$ in this theory. Applying $\mathrm{TI}(a, \varphi)$ to $\varphi(b) \equiv I(\varepsilon_b)$ for $a < \varepsilon_0$
we conclude:


LEMMA. $(\Pi^0_\infty\text{-CA}) \vdash I(\gamma)$ for each $\gamma < \varepsilon_{\varepsilon_0}$.


Curiously, though, there is still an instance of the scheme $\mathrm{TI}(\varepsilon_0)$ which is
not provable in $(\Pi^0_\infty\text{-CA})$. This is a special case of a general theorem of
Kreisel which applies to any finite extension of $Z^2$ and which will be
discussed in 8.2. Of course in the theory $(\Pi^0_\infty\text{-CA}) + (\mathrm{BI})$ we have $\mathrm{TI}(a)$ as
soon as we have derived $I(a)$. But there are also certain intermediate
theories with this "balanced" property.


6.4. Iterated comprehension principles; ramified analysis 


6.4.1. Jump operator hierarchies and iterated $(\Pi^0_1\text{-CA})$

The recursion-theoretic jump operator embodies $\exists^N$ as a functional
$F: \mathcal{P}(N) \to \mathcal{P}(N)$. In general, given such a functional $F$, a well-ordering
relation $\prec$ in N and arbitrary $X_0$ we define $X = \langle (X)_n \rangle$ ($n$ in the field of $\prec$)
which iterates $F$ along $\prec$; this is given by transfinite recursion as follows:

(i) $(X)_0 = X_0$,

(ii) $(X)_{a \oplus 1} = F((X)_a)$,

(iii) for limit $a$: $x \in (X)_a \leftrightarrow (x)_0 \in (X)_{(x)_1}$ and $(x)_1 \prec a$.

$X$ is called a hierarchy for $F$ along $\prec$ starting with $X_0$. In (i)-(iii) we assume
0 is the least element under $\prec$ and $a \oplus 1$ is the successor of $a$ in $\prec$.
Suppose now that $\prec$ and $\lambda a.(a \oplus 1)$ are recursive and that the set of limit
elements is recursive. Given a formal definition of $F(X) = Y$, write
$\mathrm{Hier}^F_\prec(X, X_0)$ to express that $X$ satisfies (i)-(iii). If the definition of $F$ may
be given in second-order form so may the formula $\mathrm{Hier}^F_\prec(X, X_0)$. In
particular, this formula is arithmetical when $F$ is the jump operator $J$. By
6.2 we can establish in $(\Delta^1_1\text{-CA}) + \mathrm{TI}(\prec)$ the statement
$\forall X_0\,\exists X\,\mathrm{Hier}^J_\prec(X, X_0)$. This may also be considered as expressing the
iteration of $(\Pi^0_1\text{-CA})$ along $\prec$. We write $(\Pi^0_1\text{-CA})_\alpha$ for this statement
along $\prec_\alpha$ (when given a natural well-ordering of order-type $\alpha$), and
$(\Pi^0_1\text{-CA})_{<\alpha}$ for the set of all statements $(\Pi^0_1\text{-CA})_\beta$ with $\beta < \alpha$ (using
initial segments of a fixed $\prec_\alpha$). Since $(\Delta^1_1\text{-CA}) \vdash \mathrm{TI}(\beta)$ for each $\beta < \varepsilon_0$ by
6.3.2, we conclude:


THEOREM. $(\Delta^1_1\text{-CA}) \supseteq (\Pi^0_1\text{-CA})_{<\varepsilon_0}$.


6.4.2. Hyperjump hierarchy and iterated $(\Pi^1_1\text{-CA})$

We now apply the same ideas to hierarchies for the hyperjump operator J, 
which embodies 3 |*. Write (IIi-CA). for VX)4X Hier4,(X, X,) and 
$(\Pi^1_1\text{-CA})_{<\alpha}$ for the set of all statements $(\Pi^1_1\text{-CA})_\beta$ for $\beta < \alpha$. While we
cannot apply the second-order statement of transfinite recursion 6.2
directly to this situation, it may be directly adapted to prove the following.


THEOREM. $(\Delta^1_2\text{-CA}) \supseteq (\Pi^1_1\text{-CA})_{<\varepsilon_0}$.


It is shown in FRIEDMAN [1970a] that the theorems of this and the
preceding section are best possible; there are also analogous results for
iterated $(\Pi^n_1\text{-CA})$ with $n > 1$. We shall discuss the results for $n = 0, 1$ in
Section 8.


6.4.3. Ramified analytic and hyperarithmetic sets 

The basic step of predicative definition consists in passing from a
collection $\mathcal{R}$ of subsets of N to the collection $\mathcal{R}^*$ of all subsets definable in
the $\omega$-model $\mathcal{R}$ from parameters in $\mathcal{R}$. The iteration of this procedure
through all set-theoretic ordinals gives:

(i) $\mathcal{R}_0 = (\emptyset)^*$ = the arithmetical sets,

(ii) $\mathcal{R}_{\alpha+1} = \mathcal{R}_\alpha^*$,

(iii) $\mathcal{R}_\lambda = \bigcup_{\alpha < \lambda} \mathcal{R}_\alpha$ for limit $\lambda$.

This is a second-order form of Gödel's notion of constructibility called the
ramified analytic sets. It is known that $\mathcal{R}_\Omega$ (where $\Omega$ is the least
uncountable ordinal) is a model of full second-order (DC). We are here concerned
only with $\mathcal{R}_\alpha$ for $\alpha \le \omega_1$ (the least non-recursive ordinal). $\mathcal{R}_{\omega_1}$ consists
exactly of the hyperarithmetic sets as shown by KLEENE [1959a],


    $\mathcal{R}_{\omega_1} = \mathrm{HYP}$.        (1)


Recall that HYP consists of the sets recursive in some $H_a$ for $a \in O$;
further, $H_a = (X)_a$ where $\mathrm{Hier}^J_\prec(X, N)$ and $\prec$ is the predecessor relation
along a path in $O$ which includes $a$; Spector showed that $H_a$ and $H_b$ are
Turing equivalent when $|a| = |b|$. Let $\mathcal{H}_\alpha$ consist of the sets recursive in
some $H_a$ for $|a| < \alpha$. In more detail, Kleene showed that


    $\mathcal{R}_\alpha = \mathcal{H}_{\omega(1+\alpha)}$ for $\alpha < \omega_1$.        (2)


Thus the ramified hierarchy up to $\omega_1$ gives another form of iterating $\exists^N$, so
to speak $\omega$ steps at a time. When $\omega\alpha = \alpha$ we have $\mathcal{R}_\alpha = \mathcal{H}_\alpha$.


6.4.4. Ramified analysis and type theory 

Given any ordinal $\alpha$ we can form a system of ramified analysis $\mathrm{RA}_\alpha$
which uses numerical variables and set variables $X^\beta, Y^\beta, Z^\beta, \ldots$ of
degree $\beta$ for each $\beta < \alpha$. The intended interpretation is that the variables
of degree $\beta$ range over $\mathcal{R}_\beta$. The axioms are those of Z plus full induction on
N in the language of $\mathrm{RA}_\alpha$, together with the ramified comprehension
axioms:


    $(\mathrm{RCA})_\alpha$    $\exists X^\beta\,\forall n\,[n \in X^\beta \leftrightarrow \varphi]$ for each $\beta < \alpha$,


where $\varphi$ has each free second-order variable of degree $\le \beta$ and each
bound second-order variable of degree $< \beta$. In addition to the preceding,
when $\alpha > \omega$ we must have an infinitary rule of generalization for each limit
$\lambda < \alpha$, of the form


    $(\mathrm{G})_\lambda$    Infer $\forall X^\lambda\,\varphi(X^\lambda)$ from $\forall X^\beta\,\varphi(X^\beta)$ for each $\beta < \lambda$.


When $\alpha$ is the type of a recursive well-ordering the rule (G) may be
partially expressed in finite form. $\mathrm{RA}_1$ is a form of $(\Pi^0_\infty\text{-CA})$ or $(\Pi^0_1\text{-CA})_{<\omega}$.
In general, if $\omega\alpha = \alpha$, $\mathrm{RA}_\alpha$ is a form of $(\Pi^0_1\text{-CA})_{<\alpha}$. It is shown in SCHÜTTE
[1960] that $\mathrm{RA}_\alpha$ proves $\mathrm{TI}(\kappa^{(\beta)}(0))$ for each $\beta < \alpha$ when $\omega^\alpha = \alpha$. This will
be discussed further in 8.2. The idea for ramified second-order theories
may be extended to finite and transfinite type theory and set theory; for
some formal systems of this kind, cf. SCHÜTTE [1960] and TAIT [1968].


6.4.5. Ramified progressions and predicativity 

The idea of predicativity taken here is: that part of mathematical
thought which is implicit in our conception of the natural numbers.
Predicative definitions may involve quantification over sets and functions
only restricted to collections specified by previously recognized classes of
predicative definitions. Formally this is incorporated in ramified
progressions of theories $(\mathrm{RA}_\alpha)_\alpha$. However, the general notion of well-ordering or
ordinal is itself prima facie impredicative, so there must be some restriction
on $\alpha$. Kreisel had proposed the characterization in terms of autonomous
progressions, where only those ordinals $\alpha$ are taken for which one has
proved the existence of a representing well-ordering $\prec_\alpha$ of type $\alpha$ in a
system $\mathrm{RA}_\beta$ with $\beta < \alpha$. Schütte and Feferman independently established
$\Gamma_0$ to be the least impredicative ordinal under this proposal. Cf. FEFERMAN
[1977] for the most recent discussion of work on this, including equivalent
unramified systems of analysis.


7. Recursion-theoretic models of finite type 


7.1. Introduction 


We are interested here in models of various of the theories considered in
Section 4, principally to get straightforward independence results, but also
to get some conservative extension results by formalizing their construction.
The structures $\mathfrak{M}$ we deal with are specified by

(i) a collection of sets $M_\tau$, considered as the range of the type $\tau$
variables (for each $\tau$ in the language), where in all cases $M_0 = N$,

(ii) application operations $\mathrm{App}_{\sigma,\tau} : M_{\sigma \to \tau} \times M_\sigma \to M_\tau$ for each $\sigma, \tau$, and

(iii) interpretations of the constants.

The entire structure is indicated by $\mathfrak{M} = \langle (M_\tau)_\tau, \ldots \rangle$. We write $fa$ for
$\mathrm{App}(f, a)$.

There are two notions of recursion in finite type which may be directly 
adapted to form models of theories considered in Section 4. The first is 
HEO, the hereditarily extensional (effective) operations (due to Kreisel) 
and the second is generated by Kleene’s schemata S1-S9. The first is 
generalized to operations over any enumerative structure in 7.2; the second 
is only briefly described in 7.3. In the applications to the second-order 
content of the theories studied, one must have good information about the 
1-section $M_1$ of $\mathfrak{M}$. This is immediate from the set-up in 7.2 but takes more
work in the case of S1-S9 and its relativizations. However, it is more 
natural to consider the latter when one takes some finite type closure 
conditions as the starting point for a model. In both cases the models 
considered in this chapter are intermediate between the minimal structures 
for given closure conditions and the maximal structures discussed in 2.6. 


7.2. Hereditary operations over an enumerative system


7.2.1. A suitable abstract notion of enumerative system $\mathcal{E}$ generalizing the
situation of ordinary recursion theory is presented in FRIEDMAN [1971],
modifying previous notions of Wagner and Strong. The data for such $\mathcal{E}$ are

(i) a set $A$ with at least two elements (the domain of $\mathcal{E}$),

(ii) collections $\mathcal{F}_1$ and $\mathcal{F}_2$ of unary and binary partial functions,
respectively, of arguments in $A$ to $A$,

(iii) total pairing and projection functions $P \in \mathcal{F}_2$, $P_1, P_2 \in \mathcal{F}_1$, and

(iv) an enumerating function $f \in \mathcal{F}_2$ for $\mathcal{F}_1$; i.e., $\mathcal{F}_1$ consists of all
functions $\lambda x.\,f(a, x)$ for $a \in A$.

$\mathcal{E}$ is further required to satisfy:

(v) $\mathcal{F}_1$, $\mathcal{F}_2$ are closed under composition,

(vi) $\mathcal{F}_1$ contains the identity function and each constant function, and

(vii) $\mathcal{F}_2$ contains each function $\mathrm{DC}_{a,b} = \lambda x, y\,(a \text{ if } x = y;\ b \text{ otherwise})$
for definition by cases.

We write $\{a\}(x)$ or $ax$ for $f(a, x)$ when $f$ is the enumerating function
specified in (iv). $\mathcal{E}$ is said to be a system over N if $A = N$ and the functions
Sc and $\mathrm{Pd} = \lambda x.(x \dot{-} 1)$ are in $\mathcal{F}_1$. Only such $\mathcal{E}$ are considered in the
following. It is easily shown that there is an $S^m_n$-function and that the
recursion theorem holds in $\mathcal{E}$. The members of $\mathcal{F}_1$ may be referred to as the
$\mathcal{E}$-partial recursive functions, and the total members as the $\mathcal{E}$-recursive
functions (of one argument).


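7.2.2. Hereditarily recursive and extensional operations over $\mathcal{E}$

Suppose given an enumerative system $\mathcal{E}$ over N. The hereditarily
recursive objects of type $\tau$ over $\mathcal{E}$, $\mathrm{HRO}_\tau(\mathcal{E})$, are defined as follows:

(i) $\mathrm{HRO}_0 = N$, and

(ii) $\mathrm{HRO}_{\sigma \to \tau} = \{a \mid \forall x\,(x \in \mathrm{HRO}_\sigma \to \{a\}(x) \in \mathrm{HRO}_\tau)\}$.

We write $\mathrm{HRO}(\mathcal{E})$ for $\langle (\mathrm{HRO}_\tau(\mathcal{E}))_\tau, \ldots \rangle$. The hereditarily extensional
effective objects of type $\tau$ over $\mathcal{E}$ are defined by interpreting $E_\tau$ of 4.4.2 in
$\mathrm{HRO}(\mathcal{E})$. Spelling this out again, $\mathrm{HEO}_\tau$ and $a =_\tau b$ are defined by
induction:


(i) $\mathrm{HEO}_0 = N$ and $a =_0 b \leftrightarrow a = b$,
(ii) $a \in \mathrm{HEO}_{\sigma \to \tau} \leftrightarrow \forall x\,[x \in \mathrm{HEO}_\sigma \to \{a\}(x) \in \mathrm{HEO}_\tau]$ and
     $\forall x, y\,[x =_\sigma y \to \{a\}(x) =_\tau \{a\}(y)]$,
     $a =_{\sigma \to \tau} b \leftrightarrow \forall x\,[x \in \mathrm{HEO}_\sigma \to \{a\}(x) =_\tau \{b\}(x)]$.


The structure $\langle (\mathrm{HEO}_\tau(\mathcal{E}))_\tau, \ldots \rangle$ is denoted $\mathrm{HEO}(\mathcal{E})$. One easily interprets
the K, S functionals in HRO and (using the recursion theorem) the
recursion operators R of all types. These may be verified to be extensional,
so one has:


THEOREM. If $\mathcal{E}$ is any enumerative system, then $\mathrm{HRO}(\mathcal{E})$ is a model of $Z^\omega$
and $\mathrm{HEO}(\mathcal{E})$ is a model of $Z^\omega + (\mathrm{Ext})$.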


7.3. Special cases 


7.3.1. $\mathcal{E}$ given by ordinary recursion theory

Let $\mathcal{E}_1$ be the enumerative system of ordinary recursion theory. In this
case it is known that HEO satisfies (QF-AC). This is remarked in KREISEL
[1959a]; extending the result of Kreisel, Lacombe, Shoenfield for type 2
effective operations, one shows that every member of HEO is a hereditarily
continuous (or countable) functional. The axiom of choice holds for the
latter as proved by KLEENE [1959b]. The definition of $\mathrm{HEO}(\mathcal{E}_1)$ can be
formalized in arithmetic, as well as the proof that it satisfies (QF-AC).
Hence we obtain the following improvement of the theorem in 4.4.2.


THEOREM. $Z^\omega + (\mathrm{Ext}) + (\mathrm{QF\text{-}AC})$ is a conservative extension of Z.


7.3.2. $\Pi^1_1$ recursion theory

One way to get a model for $(\exists^N)$ is to take $\mathcal{E}_2$ to be an enumerative
system for the partial $\Pi^1_1$ functions (obtained by uniformizing $\Pi^1_1$ relations).
The $\mathcal{E}_2$-recursive functions are thus just the hyperarithmetic functions and
the elements of $\mathrm{HRO}(\mathcal{E}_2)$ might be called the hereditarily hyperarithmetic
operations. HARRISON [1968] showed that HYP is a model of $\Sigma^1_1\text{-DC}$
(extending KREISEL [1962]). It is reasonable to conjecture that $\mathrm{HEO}(\mathcal{E}_2)$
satisfies full (QF-AC). We have, at any rate:


THEOREM. $\mathrm{HEO}(\mathcal{E}_2)$ is a model of $Z^\omega + (\mathrm{Ext}) + (\exists^N)$, in which the 1-section
is HYP and hence a model of $\Sigma^1_1\text{-DC}$.


Note that formalizing the argument in this case does not lead to an
obvious conservation result, since we need (BI) as well as $(\Delta^1_1\text{-CA})$ to treat
the properties of HYP. In this respect the proof-theoretical methods of
Section 8 will be superior.


7.4. Recursive functionals of finite type 


In KLEENE [1959c] partial recursive functionals are defined over the
maximal type structure. Fixing one argument to be a given functional $F$,
one gets a notion of being partial recursive in $F$. For any finite type $\tau$ over 0,
this leads to $\mathrm{Sec}_\tau(F)$ = the set of objects of type $\tau$ which are (total) recursive
in $F$. The structure $\mathfrak{M} = \mathrm{Rec}(F) = \langle (\mathrm{Sec}_\tau(F))_\tau, \ldots \rangle$ satisfies $Z^\omega$. By choosing $F$
to correspond to given closure conditions, we may use $\mathfrak{M}$ to give models of
theories embodying these closure conditions. For example, this holds with
F=3"% and F=4 |‘. The following indicates some known relevant 
information about Rec(F); cf. also Chapter C.6. 

KLEENE [1959c] showed that $\mathrm{Sec}_1(\exists^N) = \mathrm{HYP}$. We know by Kleene's
basis theorem that Rec(3 |) is a model of (BI). By Gandy’s Selection 
Theorem (GANDY [1967]), $\mathrm{Rec}(F)$ is a model of $(\Sigma^1_1\text{-DC})$ for any $F$ of type
2 in which $\exists^N$ is recursive. For such $F$, SHOENFIELD [1968] has given a
hierarchy $(H^F_a)_{a \in O^F}$ for $\mathrm{Sec}_1(F)$ which generalizes the hyperarithmetic
hierarchy; this has length $\omega_1^F$ = the least ordinal not (definable by a
well-ordering) recursive in $F$. We denote by $\mathcal{H}^F_\alpha$ the set of functions
recursive in some H£ of Shoenfield’s hierarchy with |a|<a. Thus #% 
coincides with the hyperarithmetical hierarchy for the ordinary jump 
operator J and #2: is the hyperjump hierarchy up to a.


7.5. Independence results using REC


Take REC to be the set of recursive functions from N to N. Studies in
recursive mathematics have led to many counterexamples to statements of
recursive analogues of classical theorems. These give independence results
for such theorems from any finite type theory T which has a model
$\mathfrak{M} = \langle (M_\tau)_\tau, \ldots \rangle$ with $M_1 = \mathrm{REC}$. In particular, $T = Z^\omega + (\mathrm{Ext}) + (\mathrm{QF\text{-}AC})$
is such a theory by 7.3.1.

For examples of such statements one may refer to 5.8, 5.9, 5.14 and 8.2 in
Chapter D.5. In addition to those mentioned there, we may note two
results of $Z^\omega + (\exists^N)$ which were discussed in Section 3 above, namely
König's Lemma (KL) and Ramsey's Theorem (RT). It is familiar that KL is
false in REC, using an infinite recursive binary tree with no branches (of
length $\omega$) in REC. Since KL implies RT, this is improved by the result of
SPECKER [1971] showing Ramsey's Theorem also to be false in REC.


7.6. Independence results using HYP 


There is also much detailed information about HYP which is useful in
obtaining independence results from any theory which has a model
$\mathfrak{M} = \langle (M_\tau)_\tau, \ldots \rangle$ with $M_1 = \mathrm{HYP}$. As we have seen,
$Z^\omega + (\mathrm{Ext}) + (\exists^N) + (\Sigma^1_1\text{-DC})$ is such a theory.


7.6.1. Some facts about HYP

KLEENE [1959a] showed that every predicate $\psi(n)$ which is in $\Sigma^1_1$-form
relativized to HYP, i.e., of the form $\psi(n) \leftrightarrow \exists f_{\mathrm{HYP}}\,\varphi(n, f)$ with $\varphi$
arithmetical, is in $\Pi^1_1$. Spector proved the converse. We state the pair of results as


    $(\Sigma^1_1)^{\mathrm{HYP}} = \Pi^1_1$.        (1)


Of course then $(\Pi^1_1)^{\mathrm{HYP}} = \Sigma^1_1$. GANDY [1960] gave a new proof of Spector's
result; the main step was to show: if $\prec_P$ is a recursive linear ordering which
is well-founded with respect to all HYP descending sequences but not
well-founded, then the longest initial well-ordered segment of $\prec_P$ is of
order type $\omega_1$. Hence there exist $\Pi^1_1$ well-orderings of order type $\omega_1$. This
was proved independently in FEFERMAN and SPECTOR [1962], where an
extension $O^*$ of $O$ was defined, partially ordered by a recursively
enumerable $\prec$ behaving locally like $<_O$; $b \in O^*$ holds by definition iff the set of
$\prec$-predecessors of $b$ is well-founded with respect to HYP. By (1), $O^* \in \Sigma^1_1$,
so $O^* - O$ is nonempty. Taking any $a \in O^* - O$, it was shown that
$\{x \mid x \prec a\}$ is a $\Pi^1_1$-path through $O$. Further, $O$ and $N - O^*$ give an example
of sets inseparable by any set in HYP; a fortiori, $O^* \notin \mathrm{HYP}$.


7.6.2. $(\Pi^1_1\text{-CA})$ and (BI) are false in HYP.
Since the $\Pi^1_1$ definition of $O$ relativized to HYP gives exactly $O^*$, the
following is an immediate conclusion.


THEOREM. $(\Pi^1_1\text{-CA})$ without parameters is false in HYP.

The next was proved by Kreisel.

THEOREM. $(\Sigma^1_1\text{-BI}_\prec)$ is false in HYP for a primitive recursive $\prec$.


PROOF. Choose a linear primitive recursive $\prec_P$ which is well-founded with
regard to HYP but not well-founded. Consider the predicate
$\psi(n) \leftrightarrow \forall f_{\mathrm{HYP}}\,[f0 \prec_P n \to \exists m\,(fm' \not\prec_P fm)]$ which expresses that $\{m \mid m \prec_P n\}$
is well-founded. By Spector's theorem $\psi(n)$ is equivalent to $\exists g_{\mathrm{HYP}}\,\varphi(n, g)$
with $\varphi$ arithmetical. Let $\psi_1(n)$ be $\exists g\,\varphi(n, g)$, so $\psi(n) \leftrightarrow (\psi_1(n))^{\mathrm{HYP}}$. Since
$\forall n\,[\forall m \prec_P n\,\psi(m) \to \psi(n)]$ is true we have $\forall n\,[\forall m \prec_P n\,\psi_1(m) \to \psi_1(n)]$
true in HYP. But $\mathrm{HYP} \models \forall n\,\psi_1(n)$ would imply $\forall n\,\psi(n)$; a
contradiction. □


7.6.3. Some mathematical statements which are false in HYP 
THEOREM. The l.u.b. principle is false in HYP.


PROOF. Take an arithmetical $\varphi$ s.t. $\exists f_{\mathrm{HYP}}\,\varphi(n, f)$ is (in $\Pi^1_1$ but) not in HYP.
Using the representation of reals by Dedekind sections, it may be shown
that $\bigcup X\,[\varphi(X),\ X \text{ Dedekind},\ X \in \mathrm{HYP}]$ is also not in HYP. □


THEOREM. CWO is false in HYP.


PROOF (Kreisel). The idea is to use two HYP inseparable predicates
$\forall f\,\exists x\,P_1(\bar f(x), n)$, $\forall f\,\exists x\,P_2(\bar f(x), n)$; the union of their complements $A_1$,
$A_2$ is N. $A_1$, $A_2$ may be written in the form $\forall f_{\mathrm{HYP}}\,\exists x\,Q_1(\bar f(x), n)$,
$\forall f_{\mathrm{HYP}}\,\exists x\,Q_2(\bar f(x), n)$ respectively, with $Q_1$, $Q_2$ recursive. Now if
comparability of well-orderings held in HYP we could form a reduction of these
complements, i.e., $A_1' \cup A_2' = A_1 \cup A_2$ where $A_1'$, $A_2'$ are disjoint and in $\Pi^1_1$
form relative to HYP. But then $A_1', A_2' \in \Sigma^1_1$, so $A_1' \in \mathrm{HYP}$, contradicting
the hypothesis. □


7.6.4. By the representation (3.1) of open sets in R as unions of sequences
of rational intervals, and of closed sets as complements of open sets, one
can formulate various statements concerning these in second-order terms
and test them in HYP. The theorem of Cantor-Bendixson (C-B) states that
every closed set $F$ has a perfect kernel $F^* \subseteq F$ and that $F - F^*$ is countable.
The following was proved in KREISEL [1959b].


THEOREM. The C-B theorem is false in HYP.


The proof loc. cit. gives a recursively described closed $F$ such that
$F - F^*$ consists of HYP reals of arbitrarily large degree, and hence
cannot be enumerated by a HYP function. Further, $F^* \cap \mathrm{HYP}$ is empty.
Thus, in this model, it is not even true that if a closed set contains no
(non-empty) perfect subset, then it is countable.

The following is an example from algebra with related features
(FEFERMAN [1975], after an example by Barwise).


THEOREM. The statement that every Abelian group contains a largest
divisible subgroup is false in HYP.


For this one defines a recursively presented Abelian (p-group) $G$ such
that $\bigcup H\,[H \subseteq G,\ H \text{ divisible and } H \in \mathrm{HYP}]$ is not in HYP.


7.6.5. Third-order statements 

Various statements about sets of reals do not have second-order
translations, but relate directly to which reals exist. These may be tested in
structures with $M_1 = \mathrm{HYP}$ in which $M_2$ has a simple description.


THEOREM. The following statements may be falsified in suitable $\mathfrak{M}$ with
$M_1 = \mathrm{HYP}$.
(i) Every set of reals which is closed under limits is closed.
(ii) Every open covering of [0,1] has a finite subcover.
(iii) There exist analytic non-Borel sets.
(iv) There exist Lebesgue non-measurable sets.
(v) Every set of reals has an outer measure.


The same $\mathfrak{M}$ suffices for (i)-(iv) but of course not (v).


7.7. Some special $\omega$-models for second-order systems


Next are examples of $\omega$-models which do not appear as the 1-sections
of the kind of finite type structures considered above. Each is specified by a
subclass of $N^N$ or $\mathcal{P}(N)$. The simplest is $\mathcal{H}_\omega$ = the collection of arithmetic
sets, which forms a model of $(\Pi^0_\infty\text{-CA})$. Thus the existence of $H_\omega$ is not
provable in $(\Pi^0_\infty\text{-CA})$, though it follows from $(\Delta^1_1\text{-CA})$. Consider
analogously


    $\mathcal{O}_1 = \mathrm{Rec}(O)$        (1)

and

    $\mathcal{O}_2$ = the collection of sets recursive in some $O_n$.        (2)


For any scheme S, $(\mathrm{S})^-$ indicates the corresponding scheme without
second-order parameters. The following were observed by Kreisel.


THEOREM. (i) $\mathcal{O}_1$ satisfies $(\Pi^1_1\text{-CA})^-$ but not full $(\Pi^1_1\text{-CA})$.
(ii) $\mathcal{O}_2$ satisfies $(\Pi^1_1\text{-CA})$ but not $(\Delta^1_2\text{-CA})$.
(iii) $\mathcal{O}_1$ is a model of $(\mathrm{BI}_\prec)$ for primitive recursive $\prec$.
(iv) $\mathcal{O}_2$ is a model of (BI).


PROOF. (i) The positive part is by the Kleene basis theorem. For the
negative part argue indirectly: if $(\Pi^1_1\text{-CA})$ were satisfied, then $O_2$ would be
in $\mathcal{O}_1$.

(ii) The positive part uses the basis theorem relativized; for the negative
result, note that $(O_n)_{n<\omega}$ is the unique solution of a $\Sigma^1_2$-predicate.

(iii) If $\prec$ is primitive recursive and has no descending sequences in $\mathcal{O}_1$,
then it is well-founded and hence TI holds over $\prec$ with any predicate.

(iv) By relativization of (iii). □


NOTE. The models given are not minimal, since it was shown by Gandy,
Kreisel and Tait that the intersection of all $\omega$-models of a recursive (or
even $\Pi^1_1$) theory is contained in HYP.


7.8. $(\Delta^1_1\text{-CA})$, $(\Sigma^1_1\text{-AC})$, $(\Sigma^1_1\text{-DC})$


KREISEL [1962] proved that HYP is the smallest model of $(\Delta^1_1\text{-CA})$, so
it cannot be used to distinguish among these three systems. FRIEDMAN
[1967, 1970a] showed that $(\Sigma^1_1\text{-DC})$ is a conservative extension of
$(\Pi^0_1\text{-CA})_{<\varepsilon_0}$, and hence also of $(\Delta^1_1\text{-CA})$, for $\Pi^1_2$-sentences; this will be described
in 8.6. He also showed that $(\Sigma^1_1\text{-DC})$ is a proper extension of $(\Sigma^1_1\text{-AC})$.
STEEL [1974] announced that $(\Sigma^1_1\text{-AC})$ is a proper extension of $(\Delta^1_1\text{-CA})$;
it is open whether (Zi— AC) is derivable from (Aj — CA).


8. Second-order content of the theories studied; proof-theoretical methods


8.1. Introduction 


The general problem here is to give precise information about the
second-order theorems $T^2$ of a finite type theory $T^\omega$, and in particular to
describe a basis for the $\Sigma^1_1$ theorems of $T^2$, i.e., a class of functions $M_1$ such
that if $T^2 \vdash \exists f\,\varphi(f)$ with $\varphi$ arithmetical, then $(\exists f \in M_1)\,\varphi(f)$. By direct
proof-theoretical methods for this problem we mean extensions of
Gentzen's cut-elimination theorems for calculi of sequents (or related
normalization theorems for calculi of natural deduction) to infinitary languages
following Lorenzen, Schütte, and Tait. Some of the systems considered
above can be translated directly into such languages, and the
cut-elimination theorem can be used to obtain information about provable
well-orderings and $\Sigma^1_1$ theorems. A few results obtained in this way are
mentioned in 8.2. By indirect methods a considerably wider class of
theories can be treated. The principal ones here make use of passage
through formally intuitionistic theories (8.4) followed by Gödel's functional
interpretation (8.5) which associates a quantifier-free theory of functionals
Ti with given T°. Finally, the terms of Ti’ may be analyzed by the method 
of normalization; some basic material for that is given in 8.3. The 
applications are dealt with in the remainder of Section 8. 


8.2. Direct proof-theoretical methods 


8.2.1. Infinitary cut-elimination 

Since these methods are described in some detail by SCHÜTTE [1960],
TAIT [1968], TAKEUTI [1975] and in Chapter D.2, only the relevant
conclusions are mentioned. The use of infinitely long formulas and
derivations is stressed because these are simplifying and make clear the
relevance of ordinals. Each such formula $\phi$ (or derivation $d$) is a
well-founded tree and hence has an ordinal length denoted by $|\phi|$ (by $|d|$).
Using a system for deriving (finite) disjunctions of formulas
$\Gamma = (\phi_1 \vee \cdots \vee \phi_n)$, the cut-rule may be put in the form


$$(C_\phi)\qquad \frac{\Gamma \vee \phi \qquad \Gamma \vee \neg\phi}{\Gamma}$$


$d$ is cut-free if no application of this rule occurs in $d$. The cut-rank
$\rho(d)$ of a derivation $d$ is defined by


$$\rho(d) = \begin{cases} \sup\{|\phi|+1 : (C_\phi) \text{ occurs in } d\} & \text{if } d \text{ contains some cut},\\ 0 & \text{if } d \text{ is cut-free}. \end{cases} \qquad (1)$$


The essential feature of the rules of infinitary predicate calculus used other 
than (C) is the subformula property: each formula in a hypothesis of the rule 
is a subformula of some formula in the conclusion. This permits establish- 
ing certain properties of cut-free derivations by induction on their length. 
The main result then is that every derivation d can be transformed into a 
cut-free derivation d*. This is proved by a direct extension of Gentzen’s 
argument for first-order predicate calculus. The following form in Tait 
[1968] gives a useful measure of the increase in length which d* may need 
compared to the length of d. 


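THEOREM. If $d \vdash \phi$ with $|d| \le \beta$ and $\rho(d) \le \gamma + \omega^\alpha$, then we can find a
derivation $d^*$ of $\phi$ with $|d^*| \le \kappa^{(\alpha)}(\beta)$ and $\rho(d^*) \le \gamma$.


In particular, if $d$ has finite cut-rank $n = \omega^0 \cdot n$, then the $n$-fold
repetition of this result gives cut-free $d^*$ of the same conclusion with
$|d^*| < \varepsilon(\beta)$ where $\varepsilon(\beta) = \varepsilon_{\nu+1}$ = the least $\varepsilon$-number beyond $\beta$
($\varepsilon_\nu \le \beta < \varepsilon_{\nu+1}$).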
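
The $n$-fold repetition can be unwound as follows. This is only a sketch: it assumes that the bounding function for $\alpha = 0$ is $\kappa^{(0)}(\beta) = \omega^\beta$ (an assumption about the notation, not stated above), and $\omega_k(\beta)$ is a name introduced here purely for illustration.

```latex
% Assumption: \kappa^{(0)}(\beta) = \omega^{\beta}.
% \omega_k(\beta) is a hypothetical abbreviation for a k-fold exponential tower.
\[
  \omega_0(\beta) = \beta, \qquad \omega_{k+1}(\beta) = \omega^{\omega_k(\beta)} .
\]
% Apply the theorem with \alpha = 0 and \gamma = n-1, n-2, \dots, 0 in turn:
\begin{align*}
  \rho(d)   &\le n = (n-1) + \omega^{0}, & |d|   &\le \beta,\\
  \rho(d_1) &\le n-1,                    & |d_1| &\le \omega_1(\beta),\\
            &\;\;\vdots                  &       &\;\;\vdots\\
  \rho(d_n) &\le 0,                      & |d_n| &\le \omega_n(\beta).
\end{align*}
% \varepsilon(\beta) exceeds \beta and is closed under \xi \mapsto \omega^{\xi}, hence
\[
  |d^*| = |d_n| \le \omega_n(\beta) < \varepsilon(\beta).
\]
```

Under the stated assumption each step costs one exponential, so a finite cut-rank never pushes the length past the next $\varepsilon$-number.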


8.2.2. The ubiquity of $\varepsilon_0$

Let $Z'$ be Z in a language with additional sorts of variables and full
induction for N, but no further new axioms. $Z'$ is translated into an
infinitary language, $\phi \mapsto \phi^*$, by taking $(\forall n\,\phi(n))^* = \bigwedge_n \phi(\bar n)^*$. Each
derivation $d \vdash \phi$ of $Z'$ may be transformed into a derivation $d^* \vdash \phi^*$ with
$|d^*| < \omega \cdot 2$ and $\rho(d^*) < \omega$. Then any such $d$ transforms to a cut-free
derivation $d^*$ of length $< \varepsilon_0$. For formulas $\phi$ of any given bounded
complexity $m$ we can express a truth definition $\mathrm{Tr}_m$ in $Z'$. Then by the
subformula property and TI($\varepsilon_0$) we can show that every formula of $d^*$ is
true if $d$ proves a formula of complexity $\le m$. This allows one to establish
the so-called reflection principle for $Z'$:


$$Z' + \mathrm{TI}(\varepsilon_0) \vdash \mathrm{Pr}_{Z'}(\ulcorner\phi\urcorner) \to \phi, \qquad (1)$$


where $\mathrm{Pr}_{Z'}(\ulcorner\phi\urcorner)$ expresses that $Z' \vdash \phi$. In particular, if we take
$\phi = \neg\psi$, it follows that


$$Z' + \psi + \mathrm{TI}(\varepsilon_0) \vdash \mathrm{Con}_{Z'+\psi}. \qquad (2)$$
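
The passage from (1) to (2) can be spelled out as follows. This is a sketch which assumes that $\mathrm{Con}_{Z'+\psi}$ is formalized as $\neg\mathrm{Pr}_{Z'}(\ulcorner\neg\psi\urcorner)$, a convention not fixed in the text.

```latex
% Assumption: Con_{Z'+\psi} abbreviates \neg Pr_{Z'}(\ulcorner \neg\psi \urcorner).
% (\ulcorner and \urcorner require the amssymb package.)
\begin{align*}
  Z' + \mathrm{TI}(\varepsilon_0)
    &\vdash \mathrm{Pr}_{Z'}(\ulcorner\neg\psi\urcorner) \to \neg\psi
    && \text{(1) with } \phi = \neg\psi,\\
  Z' + \mathrm{TI}(\varepsilon_0)
    &\vdash \psi \to \neg\mathrm{Pr}_{Z'}(\ulcorner\neg\psi\urcorner)
    && \text{contraposition},\\
  Z' + \psi + \mathrm{TI}(\varepsilon_0)
    &\vdash \mathrm{Con}_{Z'+\psi}
    && \text{by the assumed definition of } \mathrm{Con}.
\end{align*}
```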


Hence by Gödel's Second Incompleteness Theorem there must be an
instance of TI($\varepsilon_0$) which is not provable in $Z' + \psi$ if it is consistent. The
following strengthening of this result is given in KREISEL and LEVY [1968].


THEOREM. Suppose S is a consistent extension of $Z'$ by axioms of some
bounded complexity; then there is an instance of TI($\varepsilon_0$) which is not provable
in S.


In particular, if $Z'$ is arithmetic in a second-order or finite type language,
every consistent bounded complexity extension of $Z'$ has an instance of
TI($\varepsilon_0$) not provable in it. This applies to such theories as ($\Pi^0_\infty$-CA),
($\Sigma^1_1$-DC), ($\Pi^1_1$-CA), ($\Sigma^1_2$-DC), $Z^\omega + (\exists^2) + (\text{QF-AC})$, etc. It does not
apply to (BI) or (BI$_<$) ($<$ primitive recursive).


8.2.3. Ordinal bounds for arithmetical and ramified analysis 

To move up to ($\Pi^0_\infty$-CA), replace the quantifiers $\forall X\,\phi(X)$ by
$\bigwedge_{\psi \in \Gamma} \phi(\psi)$ where $\Gamma$ is the class of arithmetical formulas. Then any finite
derivation $d$ of $\phi$ in ($\Pi^0_\infty$-CA) is transformed directly into a derivation $d^*$
of $\phi^*$ where $|d^*| < \omega \cdot 2$ and $\rho(d^*) < \omega \cdot 2$. Applying the theorem of 8.2.1
first to reduce cut-rank from $\omega + n$ to $\omega$, and then from $\omega$ to 0 one obtains
cut-free $d^*$ with $|d^*| < \varepsilon_{\varepsilon_0}$.


THEOREM. ($\Pi^0_\infty$-CA) $\nvdash I(\varepsilon_{\varepsilon_0})$.


Ordinal bounds for the ramified systems $\mathrm{RA}_\alpha$ may be obtained similarly.
The simplest description is when $\alpha$ is an $\varepsilon$-number.


THEOREM. If $\omega^\alpha = \alpha$, then $\mathrm{RA}_\alpha \nvdash I(\kappa^{(\alpha)}(0))$.


From this one obtains that induction up to $\Gamma_0$ (the least solution of
$\kappa^{(\alpha)}(0) = \alpha$) is not provable in an autonomous progression of ramified
theories (cf. 6.4.5).


8.3. Normalization of infinite terms 


8.3.1. The terms and reduction procedures 

We here follow TAIT [1965] in considering terms built up from variables
of finite type and 0 by successor, application, abstraction ($\lambda a.t$) and
sequencing ($\langle t_n \rangle_n$). These may be treated similarly to infinite derivations,
and have certain technical advantages compared to the use of combinators
and recursion operators; the latter are replaced by infinite terms as follows
(for each appropriate combination of types):


$$R \mapsto R^* = \lambda f\,\lambda a.\langle t_n \rangle_n, \quad \text{where } t_0 = a,\ t_{n+1} = f\,\bar n\,t_n. \qquad (1)$$


Again, each term $t$ is a well-founded tree and has a length $|t|$. Replacing
the combinators K and S by $\lambda a \lambda b.a$ and $\lambda f \lambda g \lambda a.fa(ga)$, respectively,
each ordinary term $t$ is sent into an infinitary $\lambda$-term $t^*$ with $|t^*| < \omega \cdot 2$. In
general, the following simplifications or immediate reductions may be
applied to infinitary terms (reading '$\triangleright_i$' as 'reduces to'):


($\triangleright_1$) $(\lambda a.t[a])s \triangleright_1 t[s]$, where $t[s] = \mathrm{subst}(s/a)t$.

($\triangleright_2$) $\langle t_n \rangle_n \bar m \triangleright_2 t_m$.

($\triangleright_3$) $(\langle t_n \rangle_n r)s \triangleright_3 \langle t_n s \rangle_n r$.


The last is applied only when $\langle t_n \rangle_n r$ is not of type 0 and $r$ is not a
numeral. Consider the least transitive and reflexive relation $\trianglerighteq$ which
contains $\triangleright_i$ ($i$ = 1, 2, 3) and preserves application. If $t \trianglerighteq s$, then $t$ and $s$
define the same object at any common assignment to their free variables.
We say that $t$ is irreducible or normal if $t \trianglerighteq s$ implies $t = s$, i.e., if $t$ has
no immediately reducible subterms. Such subterms have a cut-complexity
analogous to that for the cut-rule; e.g., the complexity of $(\lambda a.t[a])s$ is
lev(type($a$)). Then $\rho(t)$ is defined as the least ordinal greater than all such
complexities in $t$. Note that $\rho(t^*) < \omega$ for $t$ an ordinary term. In analogy to
the cut-elimination theorem of 8.2.1, TAIT [1965] obtained a normalization
theorem, of which we need only the following special result.


THEOREM. Suppose $\rho(t) < \omega$ and $|t| < \varepsilon_0$; then we can find $t^*$ in n.f. with
$t \trianglerighteq t^*$ and $|t^*| < \varepsilon_0$.


The proof is similar to that for derivations. 


8.3.2. Normal terms of type 0 

Every term $t$ of type 1 defines the same function as $\lambda n.tn$. Thus for a
study of such it is sufficient to consider normal terms of type 0. If $t$ is of
type 0 and normal and contains only numerical free variables, then either

(i) $t = 0$ or

(ii) $t = s'$ where $s$ is normal or

(iii) $t$ is a variable of type 0 or

(iv) $t = \langle t_n \rangle_n s$ where each $t_n$ and $s$ is normal of type 0 but $s$ is not a
numeral.

It follows that the only closed normal terms of type 0 are numerals. Suppose
now that $t$ is normal and contains only variables of type 0, 1, 2 free. Then
either (i)-(iv) or

(v) $t = f^1(s)$ where $f^1$ is a type 1 variable, and $s$ is normal of type 0, or

(vi) $t = f^2(\lambda n.s)$ where $f^2$ is a type 2 variable and $s$ is normal of type 0,
or finally

(vii) $t = f^2(\langle t_n \rangle_n)$ where each $t_n$ is normal of type 0.

Thus again we have a complete generation of the normal terms of type 0.


8.3.3. The 1-section of (relative) primitive recursion 

Let PR be Gödel's primitive recursive functionals of finite type, i.e.,
those defined by terms generated from 0, Sc, using all K, S, and R's. For F
of type 2, PR(F) denotes the class generated using F as well. Let $\mathrm{REC}_{<\varepsilon_0}$
consist of the functions of type 1 defined by ordinary primitive recursion
and transfinite recursion on any $<_\alpha$ for $\alpha < \varepsilon_0$. The following is in TAIT
[1965].


THEOREM. 1-sec(PR) = $\mathrm{REC}_{<\varepsilon_0}$.


PROOF. By 8.3.1 each element of 1-sec(PR) is denoted by an effectively
given infinite $\lambda$-term of the form $\lambda n.t$ where $t$ is normal of type 0 with at
most '$n$' free, and $|t| < \varepsilon_0$. Then the value of $t$ at $n$, Val($t$, $n$), is defined by
recursion on the length of $t$. □
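
The recursion defining Val can be illustrated with a small evaluator for normal terms of type 0 built by cases (i)-(iv) of 8.3.2. This is a hedged sketch, not the construction of the proof: the class names, `numeral`, and `val` are hypothetical, an infinite sequence $\langle t_n \rangle_n$ is represented lazily by a Python function from $n$ to its $n$-th component, and the side condition in (iv) that $s$ is not a numeral is not enforced.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Zero:
    """The term 0, case (i)."""


@dataclass(frozen=True)
class Succ:
    """The successor s' of a term s, case (ii)."""
    arg: object


@dataclass(frozen=True)
class Var:
    """A numerical (type 0) variable, case (iii)."""
    name: str


@dataclass(frozen=True)
class SeqApp:
    """<t_n>_n applied to s, case (iv); `component` maps n to the term t_n."""
    component: Callable[[int], object]
    index: object


def numeral(k: int):
    """Build the numeral for k as k successors applied to Zero (helper for examples)."""
    term = Zero()
    for _ in range(k):
        term = Succ(term)
    return term


def val(term, env: Dict[str, int]) -> int:
    """Evaluate a type-0 term under an assignment of numbers to its free variables."""
    if isinstance(term, Zero):
        return 0
    if isinstance(term, Succ):
        return val(term.arg, env) + 1
    if isinstance(term, Var):
        return env[term.name]
    if isinstance(term, SeqApp):
        # First evaluate the index s, then descend into the selected component.
        k = val(term.index, env)
        return val(term.component(k), env)
    raise TypeError(f"not a type-0 term: {term!r}")


# Example: the term <2k>_k applied to the variable n denotes doubling.
double_n = SeqApp(lambda k: numeral(2 * k), Var("n"))
print(val(double_n, {"n": 5}))
```

Each recursive call descends to an immediate subterm, which is why, for genuinely infinite well-founded terms, the corresponding definition proceeds by recursion on the ordinal length $|t|$.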


Next, using Shoenfield's hierarchy $H^F_a$ and the classes $\mathcal{H}^F_{\varepsilon_0}$ of functions
recursive in some $H^F_a$ with $|a| < \varepsilon_0$, we obtain analogously (FEFERMAN
[1971]):


THEOREM. 1-sec(PR(F)) = $\mathcal{H}^F_{\varepsilon_0}$.


PROOF. Each element of 1-sec(PR(F)) is denoted by some effectively given
term $\lambda n.t$ where $t$ contains at most $n$, $f^2$ free where $f^2$ denotes F, $t$ is
normal of type 0, and $|t| < \varepsilon_0$. Following the build-up of 8.3.2 we see that
the function defined by $\lambda n.t$ is recursive in $H^F_a$ where $|a| = |t|$. □


COROLLARY. 1-sec(PR($\mu$)) = 1-sec(PR($\exists^2$)) = $\mathcal{H}_{\varepsilon_0}$.


8.3.4. The 1-section of relative elementary primitive recursion 

$\widehat{\mathrm{PR}}(F)$ is described in the same way as PR(F) except we use the
operators $\hat R$ instead of $R$.


THEOREM. Suppose $\exists^2$ is recursive in F; then 1-sec($\widehat{\mathrm{PR}}(F)$) = $\mathcal{H}^F_\omega$.


PROOF (FEFERMAN [1971]). The relation $\hat R f a n b = m$ is arithmetically
definable in $f$, $a$, $n$, $b$, $m$ since we need only talk about finite sequences of
type 0 objects. Thus, every member of $\widehat{\mathrm{PR}}(F)$ is definable by a finite $\lambda$-term
without sequencing but with $+$ and $\cdot$, and with at most $f^2$ free.
Normalization of these takes place in the finite $\lambda$-calculus. □


COROLLARY. 1-sec($\widehat{\mathrm{PR}}(\mu)$) = 1-sec($\widehat{\mathrm{PR}}(\exists^2)$) = the arithmetic functions.


8.4. Passage to formally intuitionistic systems 


8.4.1. The negative translation 

With each classical theory T considered below we associate a formally 
intuitionistic theory (T)' by dropping the law of excluded middle (LEM) 
from its basic logic (since it is generally easier to analyze the explicit 
content of (T)'). LEM is retained for atomic formulas. We emphasize the 
adjective ‘‘formally’’, since the systems which contain functionals such as 
3” or 3 J, etc. do not express intuitionistic principles. In 1933 Gédel gave 
a simple translation of classical number theory Z into intuitionistic number
theory HA ("Heyting's Arithmetic") which extends directly to a translation
of T into (T)' for many of the T considered here. This sends each $\phi$ into a
formula $(\phi)'$ where $(\ )'$ preserves atomic formulas and the operations $\wedge$,
$\to$ and $\forall$ while


$$(\phi \vee \psi)' = \neg(\neg(\phi)' \wedge \neg(\psi)'), \qquad (1)$$
$$(\exists a\,\phi)' = \neg\forall a\,\neg(\phi)'. \qquad (2)$$


Equivalently, one can take $(\phi \vee \psi)'$ to be $\neg\neg[(\phi)' \vee (\psi)']$ and $(\exists a\,\phi)'$ to
be $\neg\neg\exists a\,(\phi)'$. Further details on this so-called negative (or double-negation)
translation may be found in Chapter D.5, 3.8-3.10.
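
The clauses (1) and (2) amount to a short structural recursion on formulas. The following is a minimal sketch under stated assumptions: the formula classes and the function name `neg_translate` are hypothetical, negation is treated as a primitive connective that the translation passes through, and atomic formulas are left unchanged as in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    name: str


@dataclass(frozen=True)
class Not:
    body: object


@dataclass(frozen=True)
class And:
    left: object
    right: object


@dataclass(frozen=True)
class Or:
    left: object
    right: object


@dataclass(frozen=True)
class Imp:
    left: object
    right: object


@dataclass(frozen=True)
class Forall:
    var: str
    body: object


@dataclass(frozen=True)
class Exists:
    var: str
    body: object


def neg_translate(phi):
    """Negative translation: only disjunction and existence are rewritten."""
    if isinstance(phi, Atom):
        return phi  # atomic formulas are preserved
    if isinstance(phi, Not):
        return Not(neg_translate(phi.body))  # assumption: negation is primitive
    if isinstance(phi, And):
        return And(neg_translate(phi.left), neg_translate(phi.right))
    if isinstance(phi, Imp):
        return Imp(neg_translate(phi.left), neg_translate(phi.right))
    if isinstance(phi, Forall):
        return Forall(phi.var, neg_translate(phi.body))
    if isinstance(phi, Or):
        # clause (1): (phi v psi)' = not(not phi' and not psi')
        return Not(And(Not(neg_translate(phi.left)), Not(neg_translate(phi.right))))
    if isinstance(phi, Exists):
        # clause (2): (exists a phi)' = not forall a not phi'
        return Not(Forall(phi.var, Not(neg_translate(phi.body))))
    raise TypeError(f"unknown formula: {phi!r}")


# Example: translate  P v (exists x) Q.
print(neg_translate(Or(Atom("P"), Exists("x", Atom("Q")))))
```

The output of `neg_translate` never contains `Or` or `Exists`, which is the sense in which the translated formula lies in the negative fragment.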


THEOREM. If T is an ($\vee$,$\exists$)-free extension of $Z^\omega$, i.e., by (possibly)
additional constants and by axioms which do not contain disjunctions or
existential quantifiers, then T is translated into (T)'.


The system $(Z^\omega)'$ is denoted $\mathrm{HA}^\omega$.


8.4.2. Extensions by some selection operators


The theorem of 8.4.1 can be applied to $Z^\omega + (\exists^2)$ by slight modification
of the additional axiom. However, for purposes below we apply it instead
to the classically equivalent system $Z^\omega + (\mu)$ where the unbounded
minimum operator $\mu$ (of type 2) is adjoined with the axiom


$$(\mu)\qquad fn = 0 \to f(\mu f) = 0 \wedge \mu f \le n.$$


Note that the definition: $\exists^2 f$ = [0 if $f(\mu f) = 0$, 1 otherwise] works with
intuitionistic logic as well so $\mathrm{HA}^\omega + (\mu)$ is equivalent to $\mathrm{HA}^\omega + (\exists^2)$.
$Z^\omega + (\mu)$ is translated directly into $\mathrm{HA}^\omega + (\mu)$. The latter system is certainly
not intuitionistic (constructive) since we can prove $\exists n(fn = 0) \vee \neg\exists n(fn = 0)$
from $f(\mu f) = 0 \vee f(\mu f) \ne 0$; more generally LEM holds for
any arithmetical formula.

Similarly, for purposes below with Z°+(4 |“) we pass through Z°+(L) 
where L is the leftmost branch selection operator having the (v,3)-free 
axiom: 


$$(L)\qquad \forall n\, f(\bar g(n)) = 0 \to \forall n\, f(\overline{Lf}(n)) = 0 \wedge Lf0 \le g0.$$


COROLLARY. (i) $Z^\omega + (\mu)$ is reduced to $\mathrm{HA}^\omega + (\mu)$ by the negative
translation.
(ii) $Z^\omega + (L)$ is reduced to $\mathrm{HA}^\omega + (L)$ by the negative translation.


8.5. Gödel's functional interpretation and its applications to $Z^\omega$


8.5.1. Form of the interpretation 

This was given originally by GÖDEL [1958] for HA but applied
immediately to $\mathrm{HA}^\omega$. For full details cf. Section 11 of Chapter D.5; we need to
know only the following. With each formula $\phi$ of the language of $\mathrm{HA}^\omega$ is
associated a formula $\phi^D$ ('D' for 'Dialectica') with the same free variables
as $\phi$:

$$\phi^D = \exists a\,\forall b\,\phi_D(a, b) \qquad (1)$$


where $a$, $b$ are (possibly empty) sequences of variables of various types
determined by $\phi$; further, $\phi_D$ is quantifier-free. Special cases in the
inductive definition of $\phi^D$ are


$$(\forall c\,\phi)^D = \exists f\,\forall c, b\,\phi_D(fc, b), \qquad (2)$$
$$(\neg\phi)^D = \exists g\,\forall a\,\neg\phi_D(a, ga). \qquad (3)$$


In particular, if the list $b$ is empty, $(\neg\phi)^D = \forall a\,\neg\phi_D(a)$ and $(\neg\neg\phi)^D$ is
$\exists a\,\neg\neg\phi_D(a)$ which is equivalent to $\exists a\,\phi_D(a)$ since atomic formulae are
decided.


COROLLARY. (i) If $\phi$ is existential, then $(\phi')^D$ is equivalent to $\phi$.
(ii) If $\phi = \forall a\,\exists b\,\theta(a, b)$ with $\theta$ quantifier-free, then $(\phi')^D$ is equivalent to
$\exists f\,\forall a\,\theta(a, fa)$.
(iii) $((\text{QF-AC})')^D$ is logically valid.


8.5.2. Interpretation of $\mathrm{HA}^\omega$ and $Z^\omega$

Let QF-$\mathrm{HA}^\omega$ be the quantifier-free part of $\mathrm{HA}^\omega$; this may be
axiomatized in a straightforward way as the theory of primitive recursive
functionals. Thus we denote it by $\mathrm{PR}^\omega$ (usually referred to as "Gödel's
T"). Gödel's argument shows (GÖDEL [1958]):


THEOREM. If $\mathrm{HA}^\omega \vdash \phi$ and $\phi^D = \exists a\,\forall b\,\phi_D(a, b)$, then for some sequence $t$ of
terms $\mathrm{PR}^\omega \vdash \phi_D(t, b)$.


COROLLARY 1. If $Z^\omega + (\text{QF-AC}) \vdash \forall a\,\exists b\,\theta(a, b)$ with $\theta$ quantifier-free,
then $\mathrm{PR}^\omega \vdash \theta(a, ta)$ for some term $t$.
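
Corollary 1 combines the negative translation of 8.4.1 with the special cases of 8.5.1. The chain can be sketched as follows, assuming that $\theta$ is built from decidable atomic formulas so that $(\theta)'$ may be identified with $\theta$, and using part (iii) of the corollary in 8.5.1 to dispose of the translated instances of (QF-AC).

```latex
% Sketch only; assumes (\theta)' is identified with \theta for quantifier-free \theta.
\begin{align*}
  (\forall a\,\exists b\,\theta(a,b))'
    &= \forall a\,\neg\forall b\,\neg\theta(a,b)
    && \text{negative translation, 8.4.1 (2)},\\
  (\neg\forall b\,\neg\theta(a,b))^{D}
    &= \exists b\,\neg\neg\theta(a,b)
    && \text{8.5.1 (3), with no existential variables in } (\forall b\,\neg\theta)^{D},\\
  (\forall a\,\neg\forall b\,\neg\theta(a,b))^{D}
    &= \exists f\,\forall a\,\neg\neg\theta(a,fa)
    && \text{8.5.1 (2)},\\
  \mathrm{PR}^{\omega}
    &\vdash \theta(a, ta)
    && \text{Theorem, using } \neg\neg\theta \leftrightarrow \theta .
\end{align*}
```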


Each term $t$ denotes a member of PR; by 8.3.3 one obtains the following
result due to Kreisel (cf. Chapter D.2).


COROLLARY 2. The provably recursive functions of $Z^\omega$ form exactly the class
$\mathrm{REC}_{<\varepsilon_0}$.


We shall obtain analogous results for theories with selection functionals. 


8.6. Non-constructive extensions of the functional interpretation 


8.6.1. General statement 
The following is obtained by applying the negative translation followed 
by the Dialectica interpretation. 


THEOREM. Suppose $\psi$ does not contain $\vee$ or quantifiers but (possibly)
additional constants. Then $Z^\omega + \psi + (\text{QF-AC})$ is Dialectica-interpreted in
$\mathrm{PR}^\omega + \psi$.


8.6.2. Application to adjunction of ($\exists^2$) and ($\mu$)


As a corollary to the preceding theorem and 8.5 we have: 


THEOREM. If $Z^\omega + (\mu) + (\text{QF-AC}) \vdash \forall a\,\exists b\,\theta(a, b)$ with $\theta$ quantifier-free,
then $\mathrm{PR}^\omega + (\mu) \vdash \theta(a, t[a])$ for some term $t$ of PR($\mu$).


THEOREM. (i) If $Z^\omega + (\mu) + (\text{QF-AC}) \vdash \exists f^1\,\theta(f)$ with $\theta$ arithmetical, then
$\theta(f)$ is true for some $f$ in $\mathcal{H}_{\varepsilon_0}$.

(ii) $Z^\omega + (\mu) + (\text{QF-AC})$ is a conservative extension of ($\Pi^0_1$-CA)$_{<\varepsilon_0}$ for
$\Pi^1_2$ sentences.


PROOF. (i) $\theta$ is equivalent to a quantifier-free formula using ($\mu$). One then
applies the preceding theorem and the result of 8.3.3 that 1-sec(PR($\mu$)) =
$\mathcal{H}_{\varepsilon_0}$.

(ii) Consider a $\Pi^1_2$ sentence $\forall f\,\exists g\,\theta(f, g)$ with $\theta$ arithmetical, provable in
$Z^\omega + (\mu) + (\text{QF-AC})$. There is a term $t[f]$ in PR($\mu$) with $\theta(f, t[f])$
provable in $\mathrm{PR}^\omega + (\mu)$. By 8.3.3 we have $t[f] \in \mathcal{H}^f_{\varepsilon_0}$ uniformly in $f$, so for
some $\alpha < \varepsilon_0$, $t[f] \in \mathcal{H}^f_\alpha$ uniformly in $f$. For $|a| = \alpha$ and variable $f$ the
existence of $H^f_a$ can be proved in ($\Pi^0_1$-CA)$_{<\varepsilon_0}$. The argument is completed
by formalization. □


Since ($\Sigma^1_1$-DC) is contained in $Z^\omega + (\exists^2) + (\text{QF-AC})$, we obtain:


COROLLARY. (i) ($\Sigma^1_1$-DC) is a conservative extension of ($\Pi^0_1$-CA)$_{<\varepsilon_0}$ for
$\Pi^1_2$-sentences.
(ii) $\kappa^{(\varepsilon_0)}(0)$ is the sup of the provably recursive well-orderings of ($\Sigma^1_1$-DC).


This result is due to FRIEDMAN [1967]; the nature of his proof, which is
also presented in FRIEDMAN [1970a], will be discussed in the next section.
The present proof via the Dialectica interpretation (allowing improvement
to finite type theories) was presented in FEFERMAN [1971].


Note. (i) tells us that ($\Sigma^1_1$-DC) is reducible to predicative principles
though it is prima facie impredicative (and its least $\omega$-model HYP is
certainly impredicative).


8.6.3. Application to adjunction of (3 |“) and (L) 
We obtain by the same lines of argument: 


THEOREM. If $Z^\omega + (L) + (\text{QF-AC}) \vdash \forall a\,\exists b\,\theta(a, b)$ with $\theta$ quantifier-free,
then $\mathrm{PR}^\omega + (L) \vdash \theta(a, t[a])$ for some term $t$ of PR(L).


THEOREM. $Z^\omega + (L) + (\text{QF-AC})$ is a conservative extension of the theory
($\Pi^1_1$-CA)$_{<\varepsilon_0}$ for $\Pi^1_3$ sentences.


Note for the latter that $\mathcal{H}^L_{\varepsilon_0}$ is just a form of the hyperjump hierarchy up
to $\varepsilon_0$. This gives another result of FRIEDMAN [1970a]:


COROLLARY. ($\Sigma^1_2$-DC) is a conservative extension of ($\Pi^1_1$-CA)$_{<\varepsilon_0}$ for $\Pi^1_3$
sentences.

Note. The idea of Friedman's proof of 8.6.2 Corollary (i) is not easy to
explain succinctly, and the following only indicates its pattern. Suppose $\psi$
is in $\Sigma^1_2$ and is consistent with ($\Pi^0_1$-CA)$_{<\varepsilon_0}$: it is to be shown that $\psi$ is
consistent with ($\Sigma^1_1$-DC). We know by 8.2.2 that there is an instance
TI($\varepsilon_0$, $\phi$) of transfinite induction up to $\varepsilon_0$ which is not provable in
($\Pi^0_1$-CA)$_{<\varepsilon_0} + \psi$. One may try to model ($\Sigma^1_1$-DC) $+ \psi$ in ($\Pi^0_1$-CA)$_{<\varepsilon_0}$
$+ \psi + \neg$TI($\varepsilon_0$, $\phi$) via the set of functions recursive in some $H_a$ with $\phi(a)$.
That does not quite work, since some portion of the scheme TI($\varepsilon_0$) must
still be used to verify the properties of the model. However, the proof can
be carried through by certain modifications of this idea. It is evidently one
of its features that non-standard models play a principal role; this is to be
expected of any model-theoretic argument, since HYP is the least standard
model of ($\Sigma^1_1$-DC). By similar arguments, FRIEDMAN [1970a] obtained the
corollary above for ($\Sigma^1_2$-DC) and further such results for ($\Sigma^1_k$-AC) when
$k > 2$.

Another proof of the corollary in 8.6.2 above for ($\Sigma^1_1$-DC) was given by
TAIT [1968] using cut-elimination methods following those of 8.2; extension
to ($\Sigma^1_2$-DC) is dealt with in TAIT [1970].

Still another proof of the corollary in 8.6.2 was given by HOWARD [1968]
using a constructive Dialectica interpretation with so-called bar recursion;
cf. Chapter D.5, Section 11.


8.6.4. Conservative extensions of systems for absolute hierarchies 

Given a definable functional F on subsets of N, let Hier£(X) be the 
formula Hier£ (X,N) of 6.4.1 expressing that X is a hierarchy for F along 
< starting with X, = WN; such sets X may be called absolute hierarchies’ for 
F. Using the jump and hyperjump operators J and J,, write (IT? — CA), for 
AX Hier.(X) and (11;—CA), for 3X Hier2(X). Then, as before, 
(Ili — CA)z,, denotes the collection of statements (I; — CA), for all B <a. 
Note that (Ai.;- CA) D (Ili— CA)<., for i = 0,1. 

The conservation results of 8.6.2, 8.6.3 may be strengthened by replacing 
(Il; — CA).., by (Ili - CA)z,, but only at the cost of decreasing the class of 
statements conserved. 


THEOREM. (i) Z° +(u)+(QF-—AC) is a_ conservative extension of 
(TI? CA)z,, for %i sentences. 

(ii) Z° +(L)+(QF—- AC) is a conservative extension of (I1;- CA)z=., for 
X} sentences. 


The proofs require no essentially new points besides those indicated
above. For example, in the case of (i) we have seen that if Z° + (u)+ 
(QF — AC) 3b 6(b) with @ arithmetical, then for some a < e, and f E &,, 
6(f) is true. H, can be proved to exist in (II? ~ CA)z,,. The argument then 
proceeds by formalization of these observations. 


Coro tary. (i) (2|— DC) is a conservative extension of (II}— CA)z=,, (and 
hence of (Aj— CA)°) for %\ sentences. 

(ii) (23- DC) is a conservative extension of (Tl;— CA)z=,, (and hence of 
(A}— CA)>) for &} sentences. 


Part (i) of this corollary was obtained in KREISEL [1975], Appendix 2; his
argument adapts (and elaborates) that of FRIEDMAN [1970a].


Note. As Kreisel remarks (see KreEIseEt [1975]), if (II? - CA); + 1(B) where 
a <w,,,then B < &o. This follows from the general result of KREISEL [1968], 
p. 341 that if T is an extension of Z’ by true %; sentences, then B < €o 
whenever Tt I(B). 


8.7. Systems with restricted induction 


We shall use the method above to improve the result of 5.5.1. In 
verifying the Dialectica interpretation of restricted induction I” it is only
necessary to use the operator $\hat R$.


THEOREM. If restricted $(Z^\omega) + (\mu) + (\text{QF-AC}) \vdash \forall a\,\exists b\,\theta(a, b)$ where $\theta$ is
quantifier-free, then $\widehat{\mathrm{PR}}^\omega + (\mu) \vdash \theta(a, t[a])$ for some term $t$, where $\widehat{\mathrm{PR}}^\omega$ is
the quantifier-free part of restricted $(Z^\omega)$.


By 8.3.4, 1-sec($\widehat{\mathrm{PR}}(\mu)$) = the arithmetic functions. Hence if the system
considered proves a $\Sigma^1_1$ sentence, that sentence is true in the structure of
arithmetic functions. Further, by formalizing the argument needed for any
particular derivation from the system, one obtains:


THEOREM. Restricted $(Z^\omega) + (\mu) + (\text{QF-AC})$ is a conservative extension
of Z.


COROLLARY. Restricted ($\Sigma^1_1$-AC) is a conservative extension of Z.


Note that the method does not work for restricted ($\Sigma^1_1$-DC), since to
derive that from higher-type (QF-AC) with ($\mu$) we need the recursion
operators R.


Note. The above theorems and corollary improve conservation results over
Z by (i) BARWISE and SCHLIPF [1975] and independently by FRIEDMAN [1975]
induction and arithmetical comprehension. One can also obtain a result for
restricted $Z^\omega + (L) + (\text{QF-AC})$ and hence restricted ($\Sigma^1_2$-AC) over
restricted ($\Pi^1_1$-CA).


9. Sources for further topics 


9.1. Methods and applications of proof theory 


(i) Basic papers (GENTZEN [1969]),
(ii) texts (SCHÜTTE [1960], TAKEUTI [1975]),
(iii) survey of methods and applications to systems of analysis (KREISEL
[1968]),
(iv) systems of natural deduction and normalization (PRAWITZ [1971]).


9.2. Systems of ordinal notations 


(i) Systems based on structures of ordinal functions (FEFERMAN [1968]),
(ii) systems employing higher number classes after Bachmann (BRIDGE
[1975], BUCHHOLZ [1975]),
(iii) ordinal diagrams (TAKEUTI [1975]).


9.3. Predicativity


Characterizations by Schütte and Feferman in terms of autonomous
ramified progressions; equivalent unramified theories (FEFERMAN [1977]).


9.4. Theories of generalized inductive definitions 

(i) One inductive definition (HOWARD [1972]),
(ii) iterated inductive definitions (FEFERMAN [1970], ZUCKER [1973]).


9.5. Formal theories of ordinals and their functional interpretations


(i) A constructive theory of countable ordinals (HOWARD [1972]),
(ii) theories of higher number classes (ZUCKER [1973]).


9.6. Continuous functional interpretations of analysis


(i) Basic notions and varieties of interpretations (KREISEL [1959a]),
(ii) bar-recursive functional interpretations (SPECTOR [1962], HOWARD
[1968], LUCKHARDT [1973]).


9.7. Functional interpretations with variable types (GIRARD [1971]).


References 


BARWISE, J. and J. SCHLIPF
[1975] On recursively saturated models of arithmetic, in: Model Theory and Algebra: A
Memorial Tribute to Abraham Robinson, edited by D. Saracino and V.B. Weispfennig,
Lecture Notes in Mathematics, Vol. 498 (Springer, Berlin) pp. 42-55.

BISHOP, E.
[1967] Foundations of Constructive Analysis (McGraw-Hill, New York).

BRIDGE, J.
[1975] A simplification of the Bachmann method for generating large countable ordinals,
J. Symbolic Logic, 40, 171-185.

BUCHHOLZ, W.
[1975] Normalfunktionen und konstruktive Systeme von Ordinalzahlen, in: Proof Theory
Symposion, Kiel 1974, edited by J. Diller and G.H. Müller, Lecture Notes in
Mathematics, Vol. 500 (Springer, Berlin) pp. 4-25.


CHURCH, A. 
[1940] A formulation of the simple theory of types, J. Symbolic Logic, 5, 56-68. 


FEFERMAN, S.
[1968] Systems of predicative analysis, II: Representations of ordinals, J. Symbolic Logic,
33, 193-219.
[1970] Formal theories for transfinite iterations of generalized inductive definitions and
some subsystems of analysis, in: Intuitionism and Proof Theory, edited by A. Kino,
J. Myhill and R.E. Vesley (North-Holland, Amsterdam) pp. 303-326.
[1971] Ordinals and functionals in proof theory, in: Proceedings of the International
Congress of Mathematics, Nice, 1970, Vol. 1 (Gauthier-Villars, Paris) pp. 229-233.
[1975] Impredicativity of the existence of the largest divisible subgroup of an Abelian
p-group, in: Model Theory and Algebra: A Memorial Tribute to Abraham
Robinson, edited by D. Saracino and V.B. Weispfennig, Lecture Notes in
Mathematics, Vol. 498 (Springer, Berlin) pp. 117-130.
[1977] A more perspicuous formal system for predicativity, in: Lorenzen Festschrift, to
appear.
[1978] Explicit Content of Actual Mathematical Analysis, Perspectives in Mathematical
Logic (Springer, Berlin), to appear.

FEFERMAN, S. and C. SPECTOR
[1962] Incompleteness along paths in progressions of theories, J. Symbolic Logic, 27,
383-390.

FRIEDMAN, H.
[1967] Subsystems of set-theory and analysis, Dissertation, Massachusetts Institute of
Technology, Cambridge, MA.
[1969] Bar induction and $\Pi^1_1$-CA, J. Symbolic Logic, 34, 353-362.




[1970a] Iterated inductive definitions and $\Sigma^1_2$-AC, in: Intuitionism and Proof Theory,
edited by A. Kino, J. Myhill and R.E. Vesley (North-Holland, Amsterdam) pp.
435-442.
[1970b] Higher set theory and mathematical practice, Ann. Math. Logic, 2, 325-357.
[1971] Axiomatic recursive function theory, in: Logic Colloquium '69, edited by R.O.
Gandy and C.M.E. Yates (North-Holland, Amsterdam) pp. 113-138.
[1975] Systems of second-order arithmetic with restricted induction, preprint.

GANDY, R.O.
[1960] Proof of Mostowski's conjecture, Bull. Acad. Polon. Sci., 8, 571-575.
[1967] General recursive functions of finite type and hierarchies of functionals, Ann. Fac.
Sci. Univ. Clermont-Ferrand 4, 5-24.

GENTZEN, G.
[1969] The Collected Papers of Gerhard Gentzen, edited by M.E. Szabo (North-Holland,
Amsterdam).

GIRARD, J.-Y.
[1971] Une extension de l'interprétation de Gödel à l'analyse et son application à
l'élimination des coupures, in: Proceedings of the Second Scandinavian Logic
Symposium, edited by J.E. Fenstad (North-Holland, Amsterdam) pp. 63-92.

GÖDEL, K.
[1958] Über eine bisher noch nicht benutzte Erweiterung des finiten Standpunktes,
Dialectica, 12, 280-287.

GRZEGORCZYK, A.
[1955] Elementarily definable analysis, Fund. Math., 41, 311-338.

HARRISON, J.
[1968] Recursive pseudo-well-orderings, Trans. Am. Math. Soc., 131, 526-543.

HILBERT, D. and P. BERNAYS
[1939] Grundlagen der Mathematik, II (Springer, Berlin, 2nd. ed. 1970).

HOWARD, W.A.
[1968] Functional interpretation of bar induction by bar recursion, Compositio Math., 20,
107-124.
[1972] A system of abstract constructive ordinals, J. Symbolic Logic, 37, 355-374.

KLEENE, S.C.
[1959a] Quantification of number-theoretic functions, Compositio Math., 14, 23-40.
[1959b] Countable functionals, in: Constructivity in Mathematics, edited by A. Heyting
(North-Holland, Amsterdam) pp. 81-100.
[1959c] Recursive functionals and quantifiers of finite types, I, Trans. Am. Math. Soc., 91,
1-52.

KREISEL, G.
[1959a] Interpretation of analysis by means of constructive functionals of finite type, in:
Constructivity in Mathematics, edited by A. Heyting (North-Holland, Amsterdam)
pp. 101-128.
[1959b] Analysis of the Cantor-Bendixson theorem by means of the analytic hierarchy,
Bull. Acad. Polon. Sci., 7, 621-626.
[1962] The axiom of choice and the class of hyperarithmetic functions, Indag. Math., 24,
307-319.
[1968] A survey of proof theory, J. Symbolic Logic, 33, 321-388.
[1975] Wie die Beweistheorie zu ihren Ordinalzahlen kam und kommt, text of lecture at
meeting of the German Math. Soc., Tübingen, Sept. 1975.

KREISEL, G. and A. LEVY
[1968] Reflection principles and their use for establishing the complexity of axiomatic
systems, Z. Math. Logik Grundlagen Math., 14, 97-142.




LORENZEN, P.
[1951] Mass und Integral in der konstruktiven Analysis, Math. Z., 54, 275-290.
[1965] Differential und Integral, eine konstruktive Einführung in die klassische Analysis
(Akad. Verlag, Frankfurt a. M.).

LUCKHARDT, H.
[1973] Extensional Gödel Functional Interpretation. A Consistency Proof of Classical
Analysis, Lecture Notes in Mathematics, Vol. 306 (Springer, Berlin).

MARTIN, D.A.
[1975] Borel determinacy, Ann. of Math., 102, 363-371.

PRAWITZ, D.
[1971] Ideas and results in proof theory, in: Proceedings of the Second Scandinavian Logic
Symposium, edited by J.E. Fenstad (North-Holland, Amsterdam) pp. 235-307.

ROGERS, H.
[1967] Theory of Recursive Functions and Effective Computability (McGraw-Hill, New
York).

SCHÜTTE, K.
[1960] Beweistheorie (Springer, Berlin).

SHOENFIELD, J.R.
[1968] A hierarchy based on a type 2 object, Trans. Am. Math. Soc., 134, 103-108.
SPECKER, E. 

[1971] Ramsey’s theorem does not hold in recursive set theory, in: Logic Colloquium ’69, 
edited by R.O. Gandy and C.E.M. Yates (North-Holland, Amsterdam) pp. 
439-442. 

SPECTOR, C. 

[1962] Provably recursive functionals of analysis, in: Recursive Function Theory, edited by 
J. Dekker, Proceedings of symposia in pure mathematics, Vol. 5 (Am. Math. Soc., 
Providence, RI) pp. 1-27. 

STEEL, J.
[1974] Forcing with tagged trees, Notices Am. Math. Soc., 21, A627.

TAIT, W.W.
[1965] Infinitely long terms of transfinite type, in: Formal Systems and Recursive
Functions, edited by J.N. Crossley and M.A.E. Dummett (North-Holland,
Amsterdam) pp. 176-185.
[1968] Normal derivability in classical logic, in: The Syntax and Semantics of Infinitary
Languages, edited by J. Barwise, Lecture Notes in Mathematics, Vol. 72 (Springer,
Berlin) pp. 204-236.
[1970] Applications of the cut-elimination theorem to some subsystems of classical
analysis, in: Intuitionism and Proof Theory, edited by A. Kino, J. Myhill and R.E.
Vesley (North-Holland, Amsterdam) pp. 475-488.

TAKEUTI, G.
[1973] A conservative extension of Peano arithmetic, unpublished.
[1975] Proof Theory (North-Holland, Amsterdam).

ZUCKER, J.I.
[1973] Iterated inductive definitions, trees and ordinals, in: Metamathematical
Investigation of Intuitionistic Arithmetic and Analysis, Lecture Notes in Mathematics,
Vol. 344 (Springer, Berlin) pp. 392-453.




D.5 


Aspects of Constructive 
Mathematics 


A.S. TROELSTRA* 


Contents 

1. Introduction
2. Logic
3. Some languages, formal systems and notations; the Gödel negative translation
4. Realizability and Church's thesis
5. Some elementary mathematics
6. Continuity; choice sequences
7. Lawless sequences
8. Markov's principle
9. Truth-value semantics for intuitionistic logic; validity in all structures
10. Finite type structures
11. The Dialectica interpretation
12. Local and global constructivizations of classical theorems
References


* The author is indebted to G. E. Minc for information concerning Russian constructivism, and to G. Kreisel for criticism 
of an earlier draft of this paper. 

HANDBOOK OF MATHEMATICAL LOGIC 

© North-Holland Publishing Company, 1977 Edited by J. Barwise 




1. Introduction 


1.1. “Constructive” in the title of this chapter is meant in a wide sense and
includes such trends or schools as

(a) finitism (HILBERT and BERNAYS [1934, §2c]; KREISEL and KRIVINE
[1966, App. IIB]);

(b) constructive recursive analysis CRA, by which we mean constructive
mathematics in the sense of SANIN [1958, 1964, 1974]; MARKOV [1962, 1971,
1974];

(c) intuitionism (in the sense of BROUWER [1949, 1952, 1954]; HEYTING
[1956]; TROELSTRA [1969]);

(d) constructivism in the narrow sense (e.g. BISHOP [1967]) which may be
described as intuitionism without choice sequences and without Church’s
thesis.


1.2. In this chapter, we approach constructivism as the study of a special 
area in the whole of mathematical experience. Certain aspects
(“phenomena” or “issues”) appear naturally in connection with this study;
here we are primarily concerned with such aspects as have mathematical 
consequences. We do not attempt the systematic development of one 
particular philosophy of mathematics, nor a detailed comparison between 
the various schools of constructivism. 

Nevertheless, we find it convenient to take one particular approach as 
the general background (“framework”) for our discussion, viz. a (liberal)
form of intuitionism. Consequently, this approach has been emphasized. 
The objective reasons for this choice are simple: most of the other 
constructive approaches can be described as restrictions or variants of this 
form of intuitionism (e.g. finitism as a restriction obtained by dropping 
“abstract” objects), but not vice versa. We are not concerned with the 
presentation of this framework as a closed whole, however. 


1.3. Although “constructive” as used in this chapter does not refer to a
clear-cut, sharply delimited part of mathematics, the various trends have
something in common: they all satisfy the demands of “naive” constructivism,
corresponding to a naive use of “constructive” which is commonplace
among mathematicians (“naive” is not used in a derogatory sense
here). This naive use concerns the interpretation of existential statements:
a proof of $\exists x\, Ax$ is constructive if we can find a particular $x$ (term of our
language), satisfying $A$, from the proof. Whatever the precise extent of
constructivity, one is usually not in doubt as to this requirement (compare
this to the naive idea of area of a surface or set, ideas which are
unproblematic in the case of a finite set and area of a rectangle). A typical
example is the proof of the following


THEOREM. There exist two irrational numbers $a$ and $b$ such that $a^b$ is rational.


A non-constructive proof can be given as follows. $(\sqrt{2})^{\sqrt{2}}$ is either
rational or irrational. In the first case, we may take $a = b = \sqrt{2}$; in the
second case, we may take $a = (\sqrt{2})^{\sqrt{2}}$, $b = \sqrt{2}$, since then $a^b = 2$.
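
For contrast, a constructive proof of the same theorem names its witnesses outright. One possible choice (an illustration added here, not taken from the argument above) is $a = \sqrt{2}$, $b = 2\log_2 3$: then $a^b = 2^{\log_2 3} = 3$, and $b$ is irrational because $2^p = 3^q$ has no solution in positive integers. The Python sketch below only checks the identity numerically in floating point; it is a sanity check, not a proof, and the helper name is hypothetical.

```python
import math

# Explicit witnesses (an illustrative choice, not from the chapter):
# a = sqrt(2) and b = 2*log2(3) are both irrational, and a**b = 2**log2(3) = 3.
a = math.sqrt(2)
b = 2 * math.log2(3)

def power_is_close_to(base: float, exponent: float, target: float) -> bool:
    """Floating-point check that base**exponent agrees with target."""
    return math.isclose(base ** exponent, target, rel_tol=1e-12)

print(power_is_close_to(a, b, 3.0))
```

The difference from the non-constructive proof is that here the pair $(a, b)$ is written down before any case distinction is made.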

It would be easy to multiply such examples. The demands of naive
constructivism often can be expressed quite well in mathematical terms
which do not at all refer to “constructivity”: if $\exists x\, A(x, y_1, \ldots, y_n)$ expresses
the existence of a solution of a diophantine equation with parameters
$y_1, \ldots, y_n$, there is an obvious difference between a result which gives us a
bound on the number of solutions for $x$, in terms of the parameters, and a
result which gives us a bound on the (size of the) solutions themselves.

Already naive constructivism leads to interesting metamathematical
problems; e.g. when we prove something of the form
$\exists x\,((Ax \land x = t_1) \lor (Ax \land x = t_2))$ for known terms $t_1$, $t_2$, can we also prove
$\exists x\,(Ax \land x = t_1)$ or $\exists x\,(Ax \land x = t_2)$? (Cf. the example.) A restriction on the logical rules turns
out to be necessary; but which restriction? Is there a re-interpretation of
the logical operators corresponding to this restriction? Such questions lead
us rapidly out of the domain of naive constructivism.


1.4. Sometimes, the meaning of “constructive” is stretched even farther,
and is taken to include views such as predicativism (e.g. WEYL [1918], or
Gödel’s footnote (added for the reprinting) in BENACERRAF and PUTNAM
[1964, p. 211]), where the underlying logic is classical, and where the 
restrictions concern only the definition principles admitted; but classical 
predicativism falls outside the scope of this chapter. 

Another constructivistic trend where the underlying logic is classical, is 
classical recursive analysis RA; this will be occasionally discussed for 
comparison. 


1.5. The principal aspects of constructivism we shall discuss are: 

(i) The role of logic and abstract concepts (e.g. as “abbreviation”,
making arguments more intelligible); reductions to quantifier-free statements;
interpretation of the logical operations (see Sections 2, 5.7, 7.2,
11.4).

(ii) “Intensional” aspects. When interpreted in the widest sense, this is a
very common phenomenon, also in classical mathematics; but in constructive
mathematics we often have to pay attention to the way an object is
given to us, where classically this is not necessary. (Example: a real number
is given to us as a real-number generator, i.e. a Cauchy sequence of
rationals.) See for these aspects 4.2(ii), 10.3, 10.4, 11.7(ii).

(iii) The validity of Church’s thesis (see Section 4). 

(iv) Continuity axioms, the possibility of a theory of continuous 
quantification (Section 6). 

(v) Usefulness of the subjectivistic interpretation (2.2(i), 7.2). 

(vi) The quest for explicit definability (4.15). 

(vii) Markov’s schema (Section 8). 

(viii) Connection between validity for intuitionistic predicate logic and 
mathematical assumptions (Section 9). 

(ix) The existence of classical counterparts to problems of constructive 
mathematics (4.14, 5.14) and systematic procedures for constructivizing 
classical theorems (Section 12). 

Of course, many of these aspects are interrelated: for example, Church’s 
thesis very blatantly requires attention to intensional aspects, and a special 
form of Markov’s schema enters in the discussion of (viii). For expository 
reasons, we have not divided the paper according to these aspects. 

We regard the concept of natural number, induction over the natural numbers, and
the introduction of rationals and integers as unproblematic, and take them
for granted.

In drawing attention to various aspects of constructivism, we shall 
introduce various metamathematical methods which can be used for 
studying those aspects, such as realizability and elimination of choice 
sequences. This chapter is not a survey of constructivism, only a survey of 
aspects; nor do we claim completeness for the bibliography; the items in 
the bibliography are included either for historical reasons or to assist in 
further study of the subjects discussed. 


How to read this chapter. For readers with little or no experience with 
constructive mathematics, the best order is probably to start with Section 1, 
then 2.1 and 2.2, Section 5, 6.1-6.4 (some notations used there are 
explained in 3.7); and then to return to the metamathematical develop- 
ments: 2.3-2.5, Sections 3 and 4, 6.5-6.23, Sections 7-12. 

For those readers who are more or less familiar with constructive 
mathematics, the chapter may be read in regular order, skipping Section 5 
(except for consultation when needed later on). 

Both categories of readers may skip Sections 7 and 9 on a first reading. 


2. Logic


2.1. We first describe, as a starting point, the Brouwer-Heyting-Kreisel
(“BHK” for short) explanation of the logical operations. We point out in
advance that

(1) we do not aim at a detailed development of these explanations;

(2) they are not intended to be “reductive”, i.e. to give an explanation in
terms of simpler notions which are already understood (note that the
classical truth-definition for logical operators is also not “reductive” in this
sense);

(3) the principal purpose of these explanations is to serve as a point of
reference with which to compare other interpretations (e.g. realizability, 4.3).

The explanation uses the primitive concepts of (constructive) proof and
construction, and tells us the meaning of “proof of a compound statement”
in terms of “proof of a constituent”.

(a) A proof of $A \land B$ consists of a proof of $A$ and a proof of $B$.

(b) A proof of $A \lor B$ consists in specifying a proof of $A$ or a proof of $B$.

(c) A proof of $A \to B$ consists of a construction $c$ which transforms any
proof of $A$ into a proof of $B$ (together with the insight that $c$ has the
property: $d$ proves $A$ $\Rightarrow$ $c(d)$ proves $B$).

(d) $\bot$ is an unprovable statement. Hence a proof of $\neg A$ (defined as
$A \to \bot$) is a construction which transforms any proof of $A$ into a proof
of $\bot$.

(e) If the variable $x$ ranges over a “basic” domain $D$ (i.e. a domain
where each construction (object) belonging to it is given as such; in other
words, the elements $d$ of $D$ “carry their own proof” that they belong to
$D$), we can explain a proof of $\forall x\, Ax$ as a construction $c$ which on
application to any $d \in D$ yields a proof $c(d)$ of $Ad$, together with the
insight that $c$ has this property. The natural numbers are an example of
such a basic domain.

If $D$ is an arbitrary domain, $c$ should act on a pair $d, d'$, with $d$ an element of
$D$ and $d'$ a proof that $d \in D$.

(f) For $x$ ranging over a basic domain $D$, a proof of $\exists x\, Ax$ is given as a
pair $c, d$, with $c$ a proof of $Ad$ and $d \in D$.

For an arbitrary domain we need a triple $(c, d, d')$ with $c$ a proof of $Ad$ and
$d'$ a proof of $d \in D$.

To illustrate this interpretation by an example, consider a statement of
the form $C \equiv (A \to \exists x\, Bx)$; the proof of $C$ contains a transformation $\tau$ of
proofs of $A$ into proofs of $\exists x\, Bx$, hence in particular $\tau$ shows how to obtain
an $x$ such that $Bx$ from a proof of $A$. Classically,
$(A \to \exists x\, Bx) \to \exists x\,(A \to Bx)$ is justified; on the BHK-explanation, the $x$ given by $\tau$ may
depend essentially on (information contained in) a proof of $A$; we cannot
expect to determine the $x$ a priori, as required for a proof of $\exists x\,(A \to Bx)$.
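
The clauses of the BHK-explanation can be mimicked, very loosely, by data structures. The Python sketch below is an informal illustration under stated assumptions, not a formalization: proofs are untyped values, a proof of a conjunction is a tuple, a proof of a disjunction is a tagged value, a proof of an implication is a function, and a proof of an existential statement is a pair (proof, witness) as in clause (f). All names, and the toy reading of $A$ as "12 is composite", are hypothetical.

```python
from typing import Any, Callable, Tuple

Proof = Any  # proofs are left untyped in this sketch


def and_comm(p: Tuple[Proof, Proof]) -> Tuple[Proof, Proof]:
    """Clause (a): from a proof (pa, pb) of A and B, build a proof of B and A."""
    pa, pb = p
    return (pb, pa)


def or_comm(p: Tuple[str, Proof]) -> Tuple[str, Proof]:
    """Clause (b): a proof of A or B records which disjunct it proves."""
    tag, q = p
    return ("right", q) if tag == "left" else ("left", q)


def compose(f: Callable[[Proof], Proof], g: Callable[[Proof], Proof]) -> Callable[[Proof], Proof]:
    """Clause (c): proofs of A -> B and B -> C yield a proof of A -> C."""
    return lambda d: g(f(d))


def tau(factorization: Tuple[int, int]) -> Tuple[Proof, int]:
    """A proof of A -> exists x Bx, where a proof of A is a nontrivial
    factorization (p, q) of 12 and Bx says that x divides 12.
    The result is a pair (proof of Bp, witness p) as in clause (f)."""
    p, q = factorization
    return (("divides", p, q), p)


def witness(transform: Callable[[Proof], Tuple[Proof, int]], proof_of_a: Proof) -> int:
    """Read off the witness x from the proof of exists x Bx produced by transform."""
    _, x = transform(proof_of_a)
    return x


# Two different proofs of A lead to two different witnesses.
print(witness(tau, (2, 6)), witness(tau, (3, 4)))
```

In this toy reading the witness depends on which proof of $A$ is supplied, which is the reason the passage from $A \to \exists x\, Bx$ to $\exists x\,(A \to Bx)$ is not justified on the BHK-explanation.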


2.2. Remarks 


2.2.1. It is plausible to assume “$c$ is a proof of $A$” to be decidable, i.e. if a
construction proves an assertion, one should be able to read off the 
assertion from the proof. (In subjectivistic terms: we recognize a proof 
when we see one; if we are in doubt, it is (to us) not a proof.) 


2.2.2. This interpretation of the logical constants presents a first example 
of the introduction of abstract concepts in constructive mathematics 
(“proof”, “construction”); and it is our understanding of these concepts
(reflection on them) which enables us to see that the laws of intuitionistic 
predicate logic are valid on this interpretation (whatever the exact exten- 
sion of the concepts of proof and construction is, the explanation is 
sufficiently clear for this). 

In finitism, one does not allow reflection on abstract concepts; one 
restricts oneself to considering “visualizable” objects. Thus one may
consider natural numbers and particular rules representing number- 
theoretic functions; but the general (abstract) concept of a rule assigning to 
each natural number another natural number as value is not considered. 

An assertion of the form $\forall x\,(t_1[x] = 0)$ ($x$ ranging over natural numbers,
$t_1$ a fixed term with parameter $x$) may be regarded as (finitistically)
established if we have a method for establishing $t_1[\bar{n}] = 0$ for each numeral
$\bar{n}$; $t_1[x] = 0 \to t_2[x] = 0$ is also unproblematic, because it is equivalent
to $(1 \mathbin{\dot{-}} t_1[x]) \cdot t_2[x] = 0$, but

$$\forall x\,(t_1[x] = 0) \to \forall x\,(t_2[x] = 0) \tag{1}$$

cannot be “finitistically understood” as such; we can establish (1) in a
finitistically meaningful way by describing a particular $t$ such that

$$t_1[t[x]] = 0 \to t_2[x] = 0 \quad \text{for all } x. \tag{2}$$
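
The replacement of an implication between equations by a single equation, used in the discussion of finitism above, can be checked mechanically on small values. The Python sketch below is an illustration only; `monus` stands for cut-off subtraction on the natural numbers, and the function names are chosen here rather than taken from the text.

```python
def monus(a: int, b: int) -> int:
    """Cut-off subtraction on the natural numbers: a - b if a >= b, else 0."""
    return a - b if a >= b else 0


def implication_holds(v1: int, v2: int) -> bool:
    """Truth value of (v1 = 0 -> v2 = 0) for natural numbers v1, v2."""
    return v1 != 0 or v2 == 0


def equation_holds(v1: int, v2: int) -> bool:
    """Truth value of the quantifier-free equation (1 monus v1) * v2 = 0."""
    return monus(1, v1) * v2 == 0


# The two conditions agree on every pair of values in a small test range.
agree = all(
    implication_holds(v1, v2) == equation_holds(v1, v2)
    for v1 in range(20)
    for v2 in range(20)
)
print(agree)
```

A finite check of this kind is of course no substitute for the (easy) general argument by cases on whether the first value is zero.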


2.2.3. On the BHK-explanation, a logically compound statement such as
$\exists x\, Ax$ or $A \to B$ asserts the existence of certain information (i.e. a proof
of such statements must contain that information) which is not made
explicit in the statement. Replacing a logically compound mathematical
statement by a corresponding one in which more of this information is
made explicit may therefore be described as a move towards a more
“logic-free” formulation. A first example is the replacement of (1) by (2), in
our discussion of finitism above.

Another example is the replacement of partially defined operations by 
total ones; for example suppose we have a statement involving a partial 
operation f: 


$$\forall x\,(Ax \to\ !fx \land B(fx)). \tag{3}$$


($!fx$ stands for: $fx$ is defined.) For any $x$ and any proof of $Ax$ we can
construct $fx$. To show that $fx$ is defined, and for its computation, we need
some information $i$ contained in a proof of $Ax$; to be definite, assume the
possible $i$’s to be coded onto N; making that information explicit, we can
replace $f$ by $f'$ and $Ax$ by $A'(i, x)$ (= “$Ax$ has a proof containing the
information $i$”) and reformulate (3) as

$$\forall i\, \forall x\,(A'(i, x) \to B(f'(i, x))) \tag{4}$$

where $f'$ is now total. Assuming $A'(i, x)$ to have a lower logical complexity
than $A$, we have managed to replace (3) by a more “logic-free” statement.
For a concrete example, see 5.7. Note however, that replacing logically
compound statements by “logic-free” ones with more explicit information
may result in a loss of intelligibility; from the viewpoint of constructive
mathematics the problem is to add “enough, but not too much” additional
information.


2.2.4. Unintended variants of the scheme of explanation given above are 
often more useful (in a technical sense) than the scheme itself: realizability 
is a case in point. The explanation given above is e.g. not accepted as basic 
by constructivists such as A.A. Markov and N.A. Sanin; we return to this 
matter in the discussion of realizability. 


2.2.5. Implication is in character rather similar to universal quantification 
in this scheme of explanation; hence it is not surprising that the nesting of 
implications has to be counted for a suitable measure of logical complexity 
(in contrast to the classical case, where only the number of alternations of 
quantifiers has to be counted). The result of DE JONGH [1973, §5] illustrates
this remark: there exists an $A$ of L[HA] such that for no $B(x)$ of L[HA] do we have
HA $\vdash B(\bar{n}) \leftrightarrow \varphi_n(A)$ for all $n$, where $\varphi_n(P)$, $n = 0, 1, \ldots$, enumerates
all propositional formulae in one variable.


2.3. A formal system for intuitionistic predicate calculus (IPC)


Let us use $x, y, z, u, v, w$ for individual variables, $t, t', \ldots$ for terms, $A, B, C$
for formulae, $\bot$ for absurdity. The various formalizations of intuitionistic
predicate logic fall into three types:

(a) Hilbert-type systems (FITTING [1969]; KLEENE [1952]; TROELSTRA
[1973a, 1.1.3]).

(b) Systems of natural deduction (PRAWITZ [1965]).

(c) Calculi of sequents (FITTING [1969]).

Each of these types has its own special uses, advantages and drawbacks. 
We describe here a Hilbert-type system which is quite convenient for 
metamathematical arguments by induction on the length of derivations, 
such as soundness proofs for interpretations. The system is taken from 
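GÖDEL [1958]. As metamathematical operators we use $\Rightarrow$ and the comma,
with the obvious interpretation.


(1) $A,\ A \to B \Rightarrow B$;

(2) $A \to B,\ B \to C \Rightarrow A \to C$;

(3) $A \lor A \to A$, $\quad A \to A \land A$;

(4) $A \to A \lor B$, $\quad A \land B \to A$;

(5) $A \lor B \to B \lor A$, $\quad A \land B \to B \land A$;

(6) $A \to B \Rightarrow C \lor A \to C \lor B$;

(7) $A \land B \to C \Rightarrow A \to (B \to C)$;

(8) $A \to (B \to C) \Rightarrow A \land B \to C$;

(9) $\bot \to A$;

(10) $B \to Ax \Rightarrow B \to \forall x\, Ax$ ($x$ not free in $B$);

(11) $\forall x\, Ax \to At$ ($t$ free for $x$ in $A$);

(12) $At \to \exists x\, Ax$ ($t$ free for $x$ in $A$);

(13) $Ax \to B \Rightarrow \exists x\, Ax \to B$ ($x$ not free in $B$).


In (10) and (13) the derivation of the premiss is not supposed to depend
on assumptions containing $x$ free.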


2.4. The “unintended” interpretations of intuitionistic logic fall into
two types:

(i) modifications of the scheme of the BHK-explanation: realizability
and modified realizability, Dialectica interpretation (see 4.2 and 11.2);

(ii) “truth-value” type semantics, e.g. topological, Beth, and Kripke
models.

The connection with the BHK-explanation is much more indirect in this
case. We shall not discuss this type of semantics in detail here. Some
remarks are contained in Section 9.

The various truth-value type semantics show that the rules of intuitionistic
logic crop up at various places in the disguise of well-known classical
structures; it is perhaps worth mentioning here that the “logic” of
forcing is in fact intuitionistic logic (KRIPKE [1965], pp. 118-120).

It seems therefore that, quite independently of any philosophical bias, 
intuitionistic logic may claim a certain interest in its own right. 

For the various types of interpretations (semantics) one can of course
investigate the problem of completeness. For the various “truth-value”
type semantics there exist at least classical completeness proofs, but on
the other hand, with respect to the intuitive validity concept as “constructively
true in all structures” we have only partial results. (See the remarks
in Section 9; a survey of completeness for the intuitive validity concept is in
TROELSTRA [1977b].)


2.5. Some formal properties of IPC; references 


2.5.1. Various formal systems for IPC: natural deduction calculus in
PRAWITZ [1965]; sequent calculi in PRAWITZ [1965], Appendix A; KLEENE
[1952], Ch. XV; FITTING [1969], Ch. 5, §1; Ch. 6, §4; Hilbert-type calculi in
TROELSTRA [1973a], 1.1.3, 1.1.4; FITTING [1969], Ch. 5, §7, 8; KLEENE [1952],
Ch. IV. The references also contain equivalence proofs between various
systems.


2.5.2. Already the monadic fragment (with one monadic predicate letter) 
of IPC is undecidable; for a proof, due to D.M. Gabbay, see e.g. SMORYNSKI 
[1973b], p. 115, If] B. The class of prenex formulae is decidable: KLEENE 
[1952], §80 gives a decision method for propositional formulae which is 
readily extended to prenex formulae. 


2.5.3. Cut elimination and normalization: Prawitz [1965], Ch. IV; KLEENE 
[1952], Ch. XIV. 

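2.5.4. IPC possesses the disjunction property DP and the explicit definability
property ED:

DP: $\vdash A \lor B \ \Rightarrow\ \vdash A$ or $\vdash B$,

ED: $\vdash \exists x\, Ax \ \Rightarrow\ \vdash At$ for a suitable term $t$;


this can be obtained as a consequence of 2.5.3. For a generalization of DP,
ED, see KLEENE [1962, 1963].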


2.5.5. The interpolation theorem was first proved in SCHÜTTE [1962]; other
proofs are in PRAWITZ [1965], Ch. IV, §3; FITTING [1969], Ch. 6, §5; an
extension to the language with equality and function symbols is in
NAGASHIMA [1966]. This result may be obtained as a corollary of 2.5.3.


2.5.6. For the connection between CPC (= classical predicate logic) and 
IPC see 3.8-3.10; KLEENE [1952], §81; an extension of this is in MINC and
OREVKOV [1963].


2.5.7. For a more detailed discussion of the BHK-explanation see 
TROELSTRA [1969], §2. For the origin of this explanation see e.g. BROUWER
[1954]; HEYTING [1931, 1954, 1955]; KREISEL [1965]. For a rival interpretation
see MARTIN-LÖF [1975].


3. Some languages, formal systems and notations; 
the Gödel negative translation


3.1. The language of first-order arithmetic L, = L[HA] 


The language contains numerical variables ($x, y, z, u, v, w$), constants 0
(zero), $S$ (successor), a constant for each primitive recursive function, =
(equality); furthermore the logical operators $\land, \lor, \forall, \exists, \to$; $\bot$ is identified
with $0 = S0$.


3.2. The formal system of intuitionistic first-order arithmetic HA 


The system is based on first-order predicate logic, axioms for equality 
and successor as usual, defining axioms for all primitive recursive functions, 
the induction schema. The system is sometimes called Heyting’s 
Arithmetic. 


3.3. Language L, = L[EL] of elementary analysis 


We extend L[HA] with variables ($a, b, c, d$) for unary number-theoretic
functions, constants Ap (for application), $\Pi$ (for primitive recursion) and $\lambda x$
(abstraction operator for explicit definition); the logical operators are now
extended with function quantifiers.


3.4. Elementary analysis EL 


The axioms and schemata of HA are extended to the language of EL; the
abstraction operator satisfies

$$(\lambda x.\, t[x])\,t' = t[t']$$

and the recursor $\Pi$ satisfies

$$\Pi \varphi t 0 = t, \qquad \Pi \varphi t (Sx) = \varphi(\Pi \varphi t x,\, x).$$

Finally we add

QF-AC$_{00}$: $\forall x\, \exists y\, A(x, y) \to \exists a\, \forall x\, A(x, ax)$ ($A$ quantifier-free).

As a convention, we shall write EL* instead of EL if we use Greek lower
case letters $\alpha, \beta, \gamma, \delta$ to denote function variables; these variables are
supposed to be distinct from the variables denoted by $a, b, c, d$. It is obvious
that EL is a conservative extension of HA: we only have to interpret the
function variables as ranging over all total recursive functions.

It is more interesting, and much more difficult, to prove that HA + AC$_{01}$ is
conservative over HA, where AC$_{01}$ is the schema
$\forall x\, \exists a\, A(x, a) \to \exists b\, \forall x\, A(x, (b)_x)$; here $(b)_x = \lambda y.\, b\,j(x, y)$, where $j$ is a (primitive recursive)
pairing function from N × N onto N. The first proof of this fact is in
GOODMAN [1968]; a new, quite different proof is in MINC [1975].


3.5. The language L, = L[HAS] of second-order arithmetic 


We add variables $X, Y, Z$ for species (sets) of natural numbers to the
language of HA; prime formulae are now of the form $t_1 = t_2$ or $Xt_1$
(alternatively $t_1 \in X$); set-quantifiers are also added.


3.6. Second-order arithmetic: HAS 
HA is now extended by addition of a comprehension schema 


CA: $\exists X\, \forall x\,[Ax \leftrightarrow Xx]$
(A any formula of L[HAS] not containing X free). 


3.7. Some notations and conventions 


Most of our notations are standard. For reference below, we list here the 
principal ones. 


3.7.1. Pairing, coding of finite sequences

$j$ is supposed to be a primitive recursive pairing function mapping N$^2$
onto N, with primitive recursive inverses. We use as abbreviations

$\varphi(x, y) =_{\mathrm{def}} \varphi\,j(x, y)$ ($\varphi$ unary function),
$X(x, y) =_{\mathrm{def}} X\,j(x, y)$ ($X$ set variable).

We assume a primitive recursive coding of finite sequences of natural
numbers onto the natural numbers to be given;

$\langle x_0, \ldots, x_u \rangle$ is the code of $x_0, \ldots, x_u$,
$\langle\ \rangle = 0$,
$\hat{x} =_{\mathrm{def}} \langle x \rangle$.

lth indicates the length function, $*$ concatenation, and $(n)_y$ is a function
satisfying, for $n = \langle x_0, \ldots, x_u \rangle$,

$(n)_y = x_y$ for $y \le u$, $\quad (n)_y = 0$ for $y > u$;

$*$ and $\lambda n \lambda y.(n)_y$ are supposed to be primitive recursive.
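
The text only assumes that some pairing function and some sequence coding with these properties are given. As one concrete possibility, the Python sketch below uses the Cantor pairing function and codes a sequence by iterated pairing, with the empty sequence coded by 0; this particular choice, and the names `unpair`, `code`, `decode`, `proj` and `concat`, are assumptions made for illustration, not the coding fixed in the chapter.

```python
import math
from typing import List, Tuple


def j(x: int, y: int) -> int:
    """Cantor pairing, a bijection from N x N onto N."""
    return (x + y) * (x + y + 1) // 2 + y


def unpair(z: int) -> Tuple[int, int]:
    """Inverse of j: returns the pair (x, y) with j(x, y) == z."""
    w = (math.isqrt(8 * z + 1) - 1) // 2  # largest w with w*(w+1)/2 <= z
    y = z - w * (w + 1) // 2
    return (w - y, y)


def code(xs: List[int]) -> int:
    """Code of a finite sequence: 0 for the empty sequence,
    1 + j(head, code(tail)) otherwise."""
    n = 0
    for x in reversed(xs):
        n = 1 + j(x, n)
    return n


def decode(n: int) -> List[int]:
    """Inverse of code; every natural number codes exactly one sequence."""
    xs = []
    while n != 0:
        x, n = unpair(n - 1)
        xs.append(x)
    return xs


def lth(n: int) -> int:
    """Length of the sequence coded by n."""
    return len(decode(n))


def proj(n: int, y: int) -> int:
    """The component (n)_y, with value 0 beyond the end of the sequence."""
    xs = decode(n)
    return xs[y] if y < len(xs) else 0


def concat(n: int, m: int) -> int:
    """Code of the concatenation of the sequences coded by n and m."""
    return code(decode(n) + decode(m))


print(decode(code([4, 0, 7])), lth(code([4, 0, 7])))
```

The functions here are written for readability; showing that they are primitive recursive, as the text requires, is a separate (routine) matter.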


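3.7.2. Notations concerning functions

$\bar{\alpha}x =_{\mathrm{def}} \langle \alpha 0, \ldots, \alpha(x - 1) \rangle$, $\quad \bar{\alpha}0 = \langle\ \rangle$;
$n \preceq m =_{\mathrm{def}} \exists n'\,(n * n' = m)$, $\quad n \prec m =_{\mathrm{def}} n \preceq m \land n \ne m$;
$\alpha \in n =_{\mathrm{def}} \exists x\,(\bar{\alpha}x = n)$;
$\alpha(\beta) \simeq x =_{\mathrm{def}} \exists y\,(\alpha(\bar{\beta}y) = x + 1)$, $\quad !\alpha(\beta) =_{\mathrm{def}} \exists x\,(\alpha(\beta) \simeq x)$;
$(\alpha \mid \beta)(x) = y =_{\mathrm{def}} \exists z\,(\alpha(\hat{x} * \bar{\beta}z) = y + 1)$;
$\alpha \mid \beta = \gamma =_{\mathrm{def}} \forall x\,((\alpha \mid \beta)(x) \simeq \gamma x)$;
$!\alpha \mid \beta =_{\mathrm{def}} \exists \gamma\,(\alpha \mid \beta = \gamma)$;
$\alpha \le \beta =_{\mathrm{def}} \forall x\,(\alpha x \le \beta x)$.


3.7.3. Notations of elementary recursion theory

Partial recursive function application is indicated by Kleene brackets
$\{\cdot\}(\cdot)$; $T$ denotes Kleene’s T-predicate, $U$ the result-extracting function. So

$$\{x\}(y) = z \leftrightarrow \exists u\,(T(x, y, u) \land Uu = z).$$

Let $t$ be a term-expression including Kleene brackets. Then

$!t \equiv$ “$t$ is defined”, $\quad !t \land At \equiv \exists x\,(t \simeq x \land Ax)$.

$\Lambda x.t$ is a Gödel number of $t$ as partial recursive function of $x$, primitive
recursive in the other parameters.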


3.7.4. Formal systems 
Formal systems are indicated by boldface letters (H, IPC, HA etc.); L[H]
is the language of H. Schemata are often denoted by combinations of
letters: CONT$_1$, AC-NN, WC-N!, chosen in agreement with KREISEL
and TROELSTRA [1970], TROELSTRA [1977a].


3.8. The negative translation: definition 


For any of the languages considered (and the languages of finite types to
be discussed later) we define a mapping $'$ inductively by

(i) $P' = \neg\neg P$ for $P$ prime,

(ii) $(A \land B)' = A' \land B'$,

(iii) $(A \to B)' = A' \to B'$,

(iv) $(\forall x\, A)' = \forall x\, A'$,

(v) $(A \lor B)' = \neg(\neg A' \land \neg B')$,

(vi) $(\exists x\, A)' = \neg\forall x\, \neg A'$.

(Here $x$ is any sort of variable occurring in the system under discussion.)
We shall agree to simplify $'$ for the case of formal systems where the prime
formulae are decidable by replacing (i) by: (i)$'$ $P' = P$ for $P$ prime.
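
The inductive clauses of the negative translation translate directly into a recursive function on formula trees. The Python sketch below is an illustration under stated assumptions: formulae are nested tuples with hypothetical tags (`"prime"`, `"and"`, `"or"`, `"imp"`, `"all"`, `"ex"`, `"bot"`), negation is encoded as implication of absurdity, and absurdity itself is left unchanged by the translation (an assumption of this sketch; the definition above speaks only of prime formulae).

```python
from typing import Any, Tuple

Formula = Tuple[Any, ...]
BOT: Formula = ("bot",)


def neg(a: Formula) -> Formula:
    """Negation, encoded as a -> bot."""
    return ("imp", a, BOT)


def translate(a: Formula, decidable_primes: bool = False) -> Formula:
    """Negative translation following clauses (i)-(vi); with decidable_primes
    set, prime formulae are left unchanged (the simplified clause)."""
    tag = a[0]
    if tag == "bot":
        return a  # assumption of this sketch: absurdity is its own translation
    if tag == "prime":
        return a if decidable_primes else neg(neg(a))
    if tag in ("and", "imp"):
        return (tag, translate(a[1], decidable_primes), translate(a[2], decidable_primes))
    if tag == "all":
        return ("all", a[1], translate(a[2], decidable_primes))
    if tag == "or":
        left = translate(a[1], decidable_primes)
        right = translate(a[2], decidable_primes)
        return neg(("and", neg(left), neg(right)))
    if tag == "ex":
        return neg(("all", a[1], neg(translate(a[2], decidable_primes))))
    raise ValueError(f"unknown formula tag: {tag!r}")


def mentions_or_ex(a: Formula) -> bool:
    """True if the formula contains a disjunction or an existential quantifier."""
    tag = a[0]
    if tag in ("or", "ex"):
        return True
    if tag in ("and", "imp"):
        return mentions_or_ex(a[1]) or mentions_or_ex(a[2])
    if tag == "all":
        return mentions_or_ex(a[2])
    return False


def show(a: Formula) -> str:
    """Render a formula as a string, printing a -> bot as a negation."""
    tag = a[0]
    if tag == "bot":
        return "⊥"
    if tag == "prime":
        return a[1]
    if tag == "imp" and a[2] == BOT:
        return "¬" + show(a[1])
    if tag in ("and", "or", "imp"):
        op = {"and": "∧", "or": "∨", "imp": "→"}[tag]
        return "(" + show(a[1]) + " " + op + " " + show(a[2]) + ")"
    quantifier = "∀" if tag == "all" else "∃"
    return quantifier + a[1] + " " + show(a[2])


P: Formula = ("prime", "P")
excluded_middle: Formula = ("or", P, neg(P))
print(show(translate(excluded_middle, decidable_primes=True)))
```

Applied to the law of the excluded middle, the translation yields a formula without disjunction that is intuitionistically provable, which is the typical effect of the translation.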


3.9. DEFINITION. If $A$ does not contain $\exists$, $\lor$ we call $A$ $\exists$-free. If $A$ is $\exists$-free
and all prime formulae in $A$ occur negated, $A$ is said to be negative. (For
systems with decidable prime formulae, negative and $\exists$-free formulae
coincide modulo logical equivalence.) The class $\mathcal{H}$ of Harrop formulae is
defined inductively by

(i) doubly negated prime formulae belong to $\mathcal{H}$,

(ii) $A, B \in \mathcal{H} \Rightarrow A \land B \in \mathcal{H}$,

(iii) $A \in \mathcal{H} \Rightarrow \forall x\, A \in \mathcal{H}$,

(iv) $B \in \mathcal{H} \Rightarrow A \to B \in \mathcal{H}$.

(For systems with decidable prime formulae, the “doubly negated” in (i)
may be omitted.) All negative formulae are equivalent to Harrop formulae.

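LEMMA. For a formal system H based on many-sorted intuitionistic predicate
logic, and $A$ a Harrop formula, H $\vdash A \leftrightarrow \neg\neg A$.


PROOF. By induction on the complexity of $A$, using repeatedly the
logical laws $\neg\neg(A \land B) \leftrightarrow (\neg\neg A \land \neg\neg B)$, $A \to \neg\neg A$,
$\neg\neg\forall x\, A \to \forall x\, \neg\neg A$, $\neg\neg(A \to B) \leftrightarrow (A \to \neg\neg B)$. □


THEOREM. For H = HA, HAS, IPC, let H$^c$ denote the corresponding system based
on classical logic. Then

H$^c$ $\vdash A$ $\Leftrightarrow$ H $\vdash A'$.


PROOF. Straightforward by induction on the length of derivations, using
the preceding lemma. □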


3.10. Historically, a translation of this type was first given in KOLMOGOROV
[1925] (for logic). The better-known GÖDEL [1933] also treats arithmetic;
the present variant is due to GENTZEN [1969]. See also KLEENE [1952], §81.


4. Realizability and Church’s thesis 


4.1. If one accepts the BHK-explanation of the logical operators and the 
natural numbers, the interpretation of HA poses no problems, but we have 
not yet committed ourselves as to the interpretation of the number- 
theoretic functions in EL. 

One possible interpretation is: the function variables range over lawlike
number-theoretic functions, i.e. functions which are completely given to us
by a law, a “recipe” for computing a value for each argument. Church’s
thesis can now, in the language of EL, be expressed as “every lawlike
function is recursive” or formally

CT: $\forall a\, \exists x\, \forall y\, \exists z\,[T(x, y, z) \land ay = Uz]$.


From the viewpoint of the Russian constructivist school, this is even the
only possible interpretation: in their mechanistic approach, “recursive” is
accepted as the mechanically precise form of “lawlike”. In the traditional
intuitionistic approach, CT is not self-evident: there is a certain gap in
the arguments which claim to show that mechanically computable =
humanly computable (for an extensive discussion see KREISEL [1972]).

The interpretation of the logical constants suggests that in fact the
following axiom of choice should hold: (AC$_{00}$ $\equiv$)

AC-NN: $\forall x\, \exists y\, A(x, y) \to \exists a\, \forall x\, A(x, ax)$,


since a proof of the premiss of this implication must contain a method
which produces for every $x$ a $y$ such that $A(x, y)$; and this method is
nothing else but a lawlike function. As soon as we start restricting the
lawlike functions, AC-NN is not obvious any more. AC-NN plus CT
yields

CT$_0$: $\forall x\, \exists y\, A(x, y) \to \exists u\, \forall x\, \exists z\,[Tuxz \land A(x, Uz)]$


which is also meaningful w.r.t. HA.

From an intuitionistic point of view CT$_0$ is problematic, i.e. it is not
obviously true nor obviously consistent when added to HA. From the
viewpoint of the Russian constructivist school (CRA) the problem is rather
to give an interpretation of the logical operators in arithmetical statements
which is in agreement with CT$_0$.


4.2. Remarks


(i) In the BHK-explanation of logic, and the justification of AC-NN
just given, we have examples of how certain abstractly formulated concepts
are accepted and axioms justified by reflection on the abstract concepts. In
this, the intuitionist approach goes beyond finitism (cf. 2.2.2), and also
beyond CRA, where ranges of variables are supposed to be listable (i.e.
effectively enumerable) or (via relativization of quantifiers) definable
subsets of listable ranges, so AC-NN does not even make sense if we
think of lawlike functions “in the abstract”.

(ii) CT forces us to pay attention to intensional aspects. For example,
the justification of AC-NN just given might tempt us to defend a
generalization

$$\forall x \in X\, \exists y \in Y\, A(x, y) \to \exists \Psi \in (X \to Y)\, \forall x \in X\, A(x, \Psi x).$$

If we applied this to CT, we would find

$$\mathrm{CT} \to \exists \Psi\, \forall a\, \forall y\, \exists z\,[T(\Psi a, y, z) \land ay = Uz]$$

but, assuming CT to be true, elementary recursion theory tells us that there
is no $\Psi$ representable as a partial recursive operation on total recursive
functions such that

$$\forall x\,(ax = bx) \to \Psi a = \Psi b$$

i.e. no $\Psi$ which has functional character w.r.t. extensional equality. It is
easy to see therefore why the generalization of the choice axiom fails w.r.t.
extensional equality: the method giving a $y$ to each $x \in X$ (an $x$ to each $a$
in the case of CT) can use much more than only the “extension” of $x$: in
principle the method may use all available information on $x$ (and notably,
in the case of total recursive functions, their Gödel number).

We shall encounter many more examples where non-extensional information
is relevant; in the example above, we can only speak of a
functional $\Psi$ if we admit non-extensional operations.
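
The role of non-extensional information can be illustrated with a toy model. In the Python sketch below (an illustration only; the table `PROGRAMS` and all helper names are hypothetical), the position of a program in a fixed finite table plays the role of a Gödel number. An operation that returns "a number computing the given function" can simply read off the number with which the function is presented, and it then distinguishes two presentations of the same function.

```python
from typing import Callable, List

# A tiny stand-in for an enumeration of programs; the position in the list
# plays the role of a Goedel number. (Hypothetical table, for illustration.)
PROGRAMS: List[Callable[[int], int]] = [
    lambda y: 2 * y,   # index 0
    lambda y: y + y,   # index 1: same values as index 0, different program
    lambda y: y * y,   # index 2
]


def apply_index(e: int, y: int) -> int:
    """Plays the role of {e}(y): run program number e on input y."""
    return PROGRAMS[e](y)


def agree_up_to(e1: int, e2: int, bound: int) -> bool:
    """Check that two programs give equal values on all arguments below bound."""
    return all(apply_index(e1, y) == apply_index(e2, y) for y in range(bound))


def psi(e: int) -> int:
    """A choice operation in this model: given a function presented by its
    program number, return a number that computes it (here: the same number)."""
    return e


print(agree_up_to(0, 1, 100), psi(0) == psi(1))
```

The finite table is of course only a caricature of an enumeration of all partial recursive functions; the sketch shows the shape of the phenomenon, not the recursion-theoretic fact quoted in the text.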


4.3. Realizability 


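Now we shall describe a re-interpretation of arithmetical formulas,
originally devised by Kleene (KLEENE [1945]) to make the constructive
interpretation of $\lor$ and $\exists$ more explicit, which will enable us to show
consistency of CT$_0$ with HA and extensions of HA, and which has produced
many interesting metamathematical results besides.

To each formula $A(x_1, \ldots, x_n)$ of L[HA], containing at most $x_1, \ldots, x_n$
free, we shall associate another formula of HA, written as $x\ \mathbf{r}\ A(x_1, \ldots, x_n)$
(“$x$ realizes $A$”) containing at most $x, x_1, \ldots, x_n$ free, $x \notin \{x_1, \ldots, x_n\}$. The
definition is by induction on the complexity of $A$. Here $j_1$, $j_2$ denote the
inverses of the pairing function $j$.

r(i) $x\ \mathbf{r}\ (t_1 = t_2) \equiv_{\mathrm{def}} (t_1 = t_2)$,

r(ii) $x\ \mathbf{r}\ (A \land B) \equiv_{\mathrm{def}} (j_1x\ \mathbf{r}\ A) \land (j_2x\ \mathbf{r}\ B)$,

r(iii) $x\ \mathbf{r}\ (A \lor B) \equiv_{\mathrm{def}} [(j_1x = 0 \to j_2x\ \mathbf{r}\ A) \land (j_1x \ne 0 \to j_2x\ \mathbf{r}\ B)]$,

r(iv) $x\ \mathbf{r}\ (A \to B) \equiv_{\mathrm{def}} \forall y\,(y\ \mathbf{r}\ A \to\ !\{x\}(y) \land \{x\}(y)\ \mathbf{r}\ B)$,

r(v) $x\ \mathbf{r}\ (\exists y\, Ay) \equiv_{\mathrm{def}} j_2x\ \mathbf{r}\ A(j_1x)$,

r(vi) $x\ \mathbf{r}\ (\forall y\, Ay) \equiv_{\mathrm{def}} \forall y\,(!\{x\}(y) \land \{x\}(y)\ \mathbf{r}\ Ay)$.


Note that if a prime formula is realizable, it is realized by every number;
also note

(a) $(x\ \mathbf{r}\ \neg A) \leftrightarrow \forall y\,(y\ \mathbf{r}\ A \to\ !\{x\}(y) \land \{x\}(y)\ \mathbf{r}\ (1 = 0)) \leftrightarrow \forall y\, \neg(y\ \mathbf{r}\ A)$.
Hence $\forall x\,(x\ \mathbf{r}\ \neg A) \leftrightarrow \exists x\,(x\ \mathbf{r}\ \neg A)$. So a negation is realized by any
number if it is realized at all.

(b) $x\ \mathbf{r}\ \neg\neg A \leftrightarrow \forall y\, \neg(y\ \mathbf{r}\ \neg A) \leftrightarrow \forall y\, \neg\forall z\, \neg(z\ \mathbf{r}\ A)
\leftrightarrow \neg\forall z\, \neg(z\ \mathbf{r}\ A) \leftrightarrow \neg\neg\exists z\,(z\ \mathbf{r}\ A)$.

(c) As may be seen by inspection of the clauses of the definition,
realizability is similar in spirit to (may be viewed as a variant of) the
BHK-explanation.

In some respects it is coarser: $\forall x\,(t[x] = 0)$ may have many different
proofs, but all Gödel numbers of all total recursive functions realize such a
purely universal statement when it is true. On the other hand, realizability
enforces recursiveness.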
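
For closed formulae built from equations by conjunction, disjunction and existential quantification only, the realizability clauses can be evaluated directly. The Python sketch below covers just that fragment, under stated assumptions: the clauses for implication and universal quantification are omitted because they quantify over all numbers and use partial recursive application; the Cantor pairing function is one possible choice of $j$; formulae are encoded as tuples with hypothetical tags; and an existential formula carries a Python function from a witness to a formula.

```python
import math
from typing import Any, Tuple

Formula = Tuple[Any, ...]


def j(x: int, y: int) -> int:
    """Cantor pairing (one possible choice of the pairing function j)."""
    return (x + y) * (x + y + 1) // 2 + y


def j1(z: int) -> int:
    """First inverse of j."""
    w = (math.isqrt(8 * z + 1) - 1) // 2
    return w - (z - w * (w + 1) // 2)


def j2(z: int) -> int:
    """Second inverse of j."""
    w = (math.isqrt(8 * z + 1) - 1) // 2
    return z - w * (w + 1) // 2


def realizes(x: int, a: Formula) -> bool:
    """Decide 'x r A' for the fragment with equations, 'and', 'or', 'ex'."""
    tag = a[0]
    if tag == "eq":
        # a true equation between closed terms is realized by every number
        return a[1] == a[2]
    if tag == "and":
        return realizes(j1(x), a[1]) and realizes(j2(x), a[2])
    if tag == "or":
        # j1(x) says which disjunct is meant; j2(x) must realize it
        return realizes(j2(x), a[1]) if j1(x) == 0 else realizes(j2(x), a[2])
    if tag == "ex":
        # j1(x) is the witness; j2(x) must realize the instance
        return realizes(j2(x), a[1](j1(x)))
    raise ValueError(f"formula outside the fragment: {tag!r}")


# exists y (y*y = 49), with the values of both sides computed from the witness
square_root_of_49: Formula = ("ex", lambda y: ("eq", y * y, 49))
disjunction: Formula = ("or", ("eq", 0, 1), square_root_of_49)

print(realizes(j(7, 0), square_root_of_49), realizes(j(1, j(7, 0)), disjunction))
```

In this fragment a realizer carries exactly the information that the BHK-explanation asks for: which disjunct holds, and a witness for each existential quantifier.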


4.4. DEFINITION. A formula of L[HA] is almost negative if it is constructed
from prime formulae or formulae $\exists x\,(t = s)$ by means of $\forall$, $\to$, $\land$.


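4.5. LEMMA. For any almost negative $A(\mathbf{a})$, $\mathbf{a}$ a non-empty string of
numerical variables containing all the variables free in $A$, there is a partial
recursive $\psi_A$ such that in HA,

(i) $A(\mathbf{a}) \to\ !\psi_A(\mathbf{a}) \land \psi_A(\mathbf{a})\ \mathbf{r}\ A(\mathbf{a})$;

(ii) $\exists x\,(x\ \mathbf{r}\ A) \to A$.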


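PROOF. By simultaneous induction on the complexity of $A$, defining $\psi_A$ as
follows:

$\psi_{t=s} = \Lambda \mathbf{a}.\,0$, $\quad \psi_{\exists x(t=s)} = \Lambda \mathbf{a}.\, j(\min_x [t = s],\, 0)$;

$\psi_{A \land B} = \Lambda \mathbf{a}.\, j(\psi_A(\mathbf{a}),\, \psi_B(\mathbf{a}))$, $\quad \psi_{A \to B} = \Lambda \mathbf{a} x.\, \psi_B(\mathbf{a})$;

$\psi_{\forall x A} = \Lambda \mathbf{a} x.\, \psi_A(\mathbf{a}, x)$. □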


4.6. LEMMA. For all $A$, $x\ \mathbf{r}\ A$ is (in HA) provably equivalent to an almost
negative formula.


PROOF. By induction on the complexity of $A$; for example, $x\ \mathbf{r}\ (A \to B)$ can
be rewritten as $\forall y\,(y\ \mathbf{r}\ A \to \exists z\, Txyz \land \forall u\,(Txyu \to Uu\ \mathbf{r}\ B))$; then we can
apply the induction hypothesis for $A$, $B$. □


4.7. LEMMA. Let ECT$_0$ (extended Church’s thesis) denote the following
schema:

ECT$_0$: $\forall x\,[Ax \to \exists y\, Bxy] \to \exists z\, \forall x\,[Ax \to \exists u\,(Tzxu \land B(x, Uu))]$

($A$ almost negative). Then for any instance $F$ of ECT$_0$ there is a numeral $\bar{n}$
such that HA $\vdash \bar{n}\ \mathbf{r}\ F$.


PROOF. Let $t \equiv \{\{u\}(x)\}(\psi_A(x))$; then we can take

$n \equiv \Lambda u.\, j(\Lambda x.\, j_1t,\ \Lambda xw.\, j(\min_v T(\Lambda x.\, j_1t, x, v),\ j(0, j_2t)))$;

the verification is straightforward. □


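4.8. CHARACTERIZATION THEOREM. For sentences A:
(i) (Soundness) HA + ECT$_0$ $\vdash A$ $\Rightarrow$ HA $\vdash \bar{n} \mathrel{\mathbf{r}} A$ for some $\bar{n}$.
(ii) HA + ECT$_0$ $\vdash A \leftrightarrow \exists x\,(x \mathrel{\mathbf{r}} A)$.
(iii) HA + ECT$_0$ $\vdash A$ $\Leftrightarrow$ HA $\vdash \exists x\,(x \mathrel{\mathbf{r}} A)$.


PROOF. (i) By induction on the length of derivations in HA, using Lemma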
4.7. The induction step consists in showing realizability of the universal 
closure of the conclusion from the assumption of the realizability of the 
universal closure of the premisses. 

(ii) By induction on the complexity of A, with Lemma 4.6. 

(iii) Combination of (i) and (ii). O 


4.9. COROLLARY. HA + ECT$_0$ is consistent relative to HA; in fact, since by
Lemma 4.5 $\exists x\,(x \mathrel{\mathbf{r}} A) \leftrightarrow A$ for almost negative A, HA + ECT$_0$ is a
conservative extension of HA w.r.t. almost negative formulae.


4.10. Thus we have obtained also that HA + CT$_0$ is conservative with
respect to negative formulae. This result can easily be extended to HAS,
simply by extending the realizability as follows: To each set variable X we
assign a set variable X* (it may be the identity), and we put


r(i)′    $x \mathrel{\mathbf{r}} Xy \equiv_{\mathrm{def}} X^*(x, y)$,
r(vii)   $x \mathrel{\mathbf{r}} \forall X\,A(X) \equiv_{\mathrm{def}} \forall X^*\,(x \mathrel{\mathbf{r}} A(X))$,
r(viii)  $x \mathrel{\mathbf{r}} \exists X\,A(X) \equiv_{\mathrm{def}} \exists X^*\,(x \mathrel{\mathbf{r}} A(X))$.


Now we can easily show:
THEOREM. HAS + CT$_0$ + UP is conservative over HAS w.r.t. negative
formulae; here UP (Uniformity Principle) is the schema
UP  $\forall X\,\exists x\,A(X, x) \to \exists x\,\forall X\,A(X, x)$.


(For comments on UP see TROELSTRA [1973b].)


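4.11. Lemma. For any instance F of Markov’s schema
M  $\forall x\,[Ax \vee \neg Ax] \wedge \neg\neg\exists x\,Ax \to \exists x\,Ax$
there is a numeral $\bar{n}$ such that


HA + M $\vdash \bar{n} \mathrel{\mathbf{r}} F$.


PROOF. We first note that with CT$_0$, M is equivalent to

M$_{\mathrm{PR}}$  $\neg\neg\exists y\,(t(x, y) = 0) \to \exists y\,(t(x, y) = 0)$.

For if $\forall x\,[Ax \vee \neg Ax]$, then with CT$_0$ there is a z such that
$\forall x\,\exists y\,[Tzxy \wedge (Uy = 0 \to Ax) \wedge (Uy \neq 0 \to \neg Ax)]$


and thus $\exists x\,Ax \leftrightarrow \exists x\,\exists y\,[Tzxy \wedge Uy = 0]$, which can be expressed as
$\exists u\,(t(z, u) = 0)$, t primitive recursive. So, since HA + ECT$_0$ $\vdash A$ $\Leftrightarrow$
HA $\vdash \exists x\,(x \mathrel{\mathbf{r}} A)$, it is sufficient to show the lemma for instances of M$_{\mathrm{PR}}$,
which is straightforward. This enables us to extend the characterization
theorem to HA + M. □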


4.12. THEOREM. HA + M + ECT$_0$ $\vdash A$ $\Leftrightarrow$ HA + M $\vdash \exists x\,(x \mathrel{\mathbf{r}} A)$.
PROOF. Immediate with 4.8 and 4.11. □


4.13. From the viewpoint of the Russian constructivist school (CRA), as
represented by N.A. Šanin and A.A. Markov, realizability is not a
semantics since it does not reduce the interpretation of compound statements
to simple ones (i.e. statements whose interpretations are regarded as
more immediately evident). For example, one objection (ŠANIN [1958]) is
that even formulae whose interpretation should be immediate are replaced
by more complicated ones by Kleene’s realizability interpretation.
However, as to the mathematical consequences, we may certainly regard
HA + M + ECT$_0$ as a codification of the mathematical practice of CRA (cf.
DRAGALIN [1973]) in view of Kleene’s proof that Šanin’s interpretation is
provably equivalent to realizability relative to HA + M (KLEENE [1960]).
Instead of ECT$_0$ one sometimes finds (as e.g. in DRAGALIN [1973])


ECT′  $\forall x\,[\neg Ax \to \exists y\,Bxy] \to \exists z\,\forall x\,[\neg Ax \to\ !\{z\}(x) \wedge B(x, \{z\}(x))]$


which however is easily seen to be equivalent to ECT$_0$ relative to HA + M
(noting that almost negative formulae are equivalent to negative formulae
by M$_{\mathrm{PR}}$ (M restricted to primitive recursive A) and for negative
A, $A \leftrightarrow \neg\neg A$ by 3.9, lemma).


4.14. For classical recursive analysis (RA) a somewhat hybrid situation 
exists: with respect to particular assertions (without parameters) the 
excluded third is used, but a statement from classical mathematics of the 
form $\forall x\,[Ax \vee \neg Ax]$ is “recursivized” by requiring a decision method for
A, recursive in x. In short, the counterpart in RA of a closed statement A
is the assertion $\exists x\,(x \mathrel{\mathbf{r}} A)$, which may be established by classical means.
RA may be regarded as one possible answer to the search for an analogue
to “constructive” within classical mathematics: recursiveness in parameters.
For another solution, “continuity in parameters”, see 5.14.


4.15. Extensions and variants of realizability; explicit definability 


4.15.1. A survey of realizability techniques for arithmetic is to be found in 
TROELSTRA [1971]; more detail is given in TROELSTRA [1973a], Ch. II. For
analysis there are similar methods available; see 6.23. 


4.15.2. One of the technically most useful variants is q-realizability
($x \mathrel{\mathbf{q}} A$) obtained by changing clauses (iii), (iv), (v) in 4.3 to

q(iii) $x \mathrel{\mathbf{q}} (A \vee B) \equiv_{\mathrm{def}} [(j_1x = 0 \to (j_2x \mathrel{\mathbf{q}} A) \wedge A) \wedge (j_1x \neq 0 \to (j_2x \mathrel{\mathbf{q}} B) \wedge B)]$,

q(iv)  $x \mathrel{\mathbf{q}} (A \to B) \equiv_{\mathrm{def}} \forall y\,(y \mathrel{\mathbf{q}} A \wedge A \to\ !\{x\}(y) \wedge \{x\}(y) \mathrel{\mathbf{q}} B)$,

q(v)   $x \mathrel{\mathbf{q}} (\exists y\,Ay) \equiv_{\mathrm{def}} (j_2x) \mathrel{\mathbf{q}} A(j_1x) \wedge A(j_1x)$


and replacing r by q in the other clauses.
The soundness theorem (cf. 4.8(i)) is then provable for q-realizability and
yields results such as DP, ED (cf. 2.5.4) for HA, and


ECR$_0$  HA $\vdash Ax \to \exists y\,B(x, y)$ $\Rightarrow$
         HA $\vdash \exists z\,\forall x\,[Ax \to\ !\{z\}(x) \wedge B(x, \{z\}(x))]$


(A almost negative). Friedman recently adapted q-realizability to HAS.
Kleene’s “Γ | C” is another method particularly suited for proving DP, ED
and related properties; in FRIEDMAN [1973] this is extended to higher-order
systems such as HAS. For an exposition of the essentials of this method see
TROELSTRA [1973a], 3.1.21-3.1.23.


4.15.3. From the viewpoint of naive constructivism, it seems a natural
question to ask for the strongest possible subsystem of classical logic still
preserving DP, ED, when added to the mathematical axioms of the usual
systems. But there is no unique answer: HA + M and HA + IP both possess
DP, ED, but HA + M + IP = HA$^c$, i.e. classical first-order arithmetic. (Here
IP is the schema $(\neg A \to \exists x\,B) \to \exists x\,(\neg A \to B)$, x not free in A; M is
defined in 4.11.) See TROELSTRA [1973a].

To see that HA + M + IP = HA$^c$, one proves by induction on the logical
complexity of A that HA + M + IP $\vdash A \vee \neg A$. Assume $A \vee \neg A$, then

(1) $\forall x\,(A \vee \neg A) \wedge \neg\neg\exists x\,A \to \exists x\,A$, hence $\neg\neg\exists x\,A \to \exists x\,A$
(induction hypothesis), and by IP $\exists y\,(\neg\neg\exists x\,Ax \to Ay)$; by the induction
hypothesis again $\neg\neg\neg\exists x\,Ax \vee \exists y\,Ay$, hence $\exists x\,A \vee \neg\exists x\,A$, and

(2) $\forall x\,Ax \leftrightarrow \neg\exists x\,\neg Ax$, hence


$\forall x\,Ax \vee \neg\forall x\,Ax \leftrightarrow (\neg\exists x\,\neg Ax \vee \neg\neg\exists x\,\neg Ax)$
$\leftrightarrow \neg\exists x\,\neg Ax \vee \exists x\,\neg Ax$


(induction hypothesis).

5. Some elementary mathematics 


5.1. In this section we develop a tiny part of constructive mathematics, in 
order to be able to illustrate some of our issues by means of examples taken 
from mathematics. Although most definitions run parallel to well-known 
classical ones, many classically equivalent definitions are not constructively 
equivalent, hence we have to pay attention to the exact formulation. For a 
readable account of the “recursive” approach, see e.g. MARTIN-LÖF [1970].


5.2. The basic universe


The theory of integers and rationals is unproblematic. We shall develop
the theory of real numbers and real-valued functions relative to a certain
universe $\mathcal{U}$ of number-theoretic functions; $\mathcal{U}$ is assumed to satisfy the
axioms for functions of EL*. The principal axiom in EL* concerning
functions is the quantifier-free axiom of choice QF-AC, which in fact
modulo the other axioms expresses closure under “recursive in”. It is not
hard to see that our condition on $\mathcal{U}$ is in fact equivalent to the following
three conditions taken together.

(1) $\mathcal{R} \subset \mathcal{U}$ (the recursive functions are contained in $\mathcal{U}$),

(2) $\mathcal{U}$ is closed under pairing,

(3) $\mathcal{U}$ is closed under all continuous operations represented by
neighbourhood functions in $\mathcal{U}$.

The class of neighbourhood functions $K_0$ (relative to $\mathcal{U}$) is defined by


$K_0\alpha \equiv \forall\beta\,\exists x\,(\alpha(\bar\beta x) \neq 0) \wedge \forall nm\,(\alpha n \neq 0 \to \alpha n = \alpha(n * m))$
and the functional $\Phi_\alpha \in N^N \to N^N$ represented by α satisfies


$(\Phi_\alpha\beta)(x) = y \leftrightarrow \exists z\,(\alpha(\langle x\rangle * \bar\beta z) = y + 1)$.


5.3. The real numbers 


We choose the method of introduction by fundamental sequences. Let
$(r_n)_n$ be a standard enumeration (without repetitions) of the rationals (if
$p/q = r_n$, we may assume p, q to be found primitive recursively in n).


DEFINITION. $(r_{\alpha n})_n$ is said to be a real number generator relative to $\mathcal{U}$
($\mathcal{U}$-r.n.g.) if there is a β (modulus of convergence) such that
$\forall k\,\forall m\,(|r_{\alpha\beta k} - r_{\alpha(\beta k+m)}| < 2^{-k})$.   (1)
This is of course equivalent to the existence of a β′ such that
$\forall k\,\forall m m'\,(|r_{\alpha(\beta' k+m)} - r_{\alpha(\beta' k+m')}| < 2^{-k})$   (1′)
(given β, we can find β′ by putting β′k = β(k + 1) for all k). We shall in


the sequel omit “relative to $\mathcal{U}$” since $\mathcal{U}$ is kept fixed.
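
A small numerical sketch may make the role of the modulus concrete. It works directly with rationals instead of indices into the enumeration $(r_n)_n$, and it takes as an assumed example the Newton approximations to $\sqrt{2}$, for which the identity happens to serve as a modulus β; the function names are hypothetical and the checks cover only finitely many k, m.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def s(n):
    """n-th Newton approximation to sqrt(2), starting from s(0) = 2."""
    if n == 0:
        return Fraction(2)
    prev = s(n - 1)
    return (prev + 2 / prev) / 2


def beta(k):
    """Assumed modulus of convergence for (s(n))_n: the identity."""
    return k


def is_modulus(k_max, m_max):
    """Check condition (1) for all k < k_max, m < m_max."""
    return all(
        abs(s(beta(k)) - s(beta(k) + m)) < Fraction(1, 2 ** k)
        for k in range(k_max)
        for m in range(m_max)
    )


def is_two_sided_modulus(k_max, m_max):
    """Check condition (1') for beta'(k) = beta(k + 1) on a finite range."""
    return all(
        abs(s(beta(k + 1) + m) - s(beta(k + 1) + m2)) < Fraction(1, 2 ** k)
        for k in range(k_max)
        for m in range(m_max)
        for m2 in range(m_max)
    )
```

A finite check of this kind is of course no proof that β is a modulus; it only illustrates what (1) and (1′) demand of the data α, β.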


DEFINITION. Two real number generators $(s_n)_n$, $(t_n)_n$ with moduli β, γ
respectively are said to be equivalent (notation: $(s_n)_n \sim (t_n)_n$) if for
$\delta n = \max(\beta n, \gamma n)$,


| Son = ten | 2°", 

As in the classical theory, one shows ~ to be an equivalence relation. Now
the next step would be to define the reals $R(\mathcal{U})$ (R for short) as equivalence
classes of r.n.g.’s relative to ~.

However we have not discussed set-existence so far. It is not essential for
our specific purposes here, since instead of using the equivalence classes we
could simply express everything by means of predicates and functions
respecting ~ (i.e. $P((t_n)_n) \wedge (t_n)_n \sim (s_n)_n \to P((s_n)_n)$ etc.); this course is
adopted by KLEENE and VESLEY [1965], and in CRA.

But it is convenient to be able to talk about sets, and therefore as a
minimum comprehension principle we shall adopt, relative to any given
language L for which the interpretation already has been provided (in our
case L[EL*] is the relevant example), the following comprehension principle
(X a variable for sets of individuals)


$\exists X\,\forall x_1 \ldots x_n\,[A(x_1, \ldots, x_n) \leftrightarrow X(x_1, \ldots, x_n)]$,   (2)


A a formula of L, $x_1, \ldots, x_n$ variables of L (L is not supposed to contain set
variables). On the basis of this principle, we can accept well-defined
properties of elements over a given domain as sets or relations. (2) justifies
the introduction of reals as equivalence classes (by a metamathematical
argument it is in fact easy to show that (2) is conservative over EL*, cf.
TROELSTRA [1973a], 1.9.8).


5.4. Operations on, and relations between real numbers 


In the usual style we may define operations and relations on reals by
defining the corresponding operations and relations for r.n.g.’s, afterwards
showing them to be invariant w.r.t. ~. So for example


$(s_n)_n + (t_n)_n = (s_n + t_n)_n$,  $(s_n)_n \cdot (t_n)_n = (s_n \cdot t_n)_n$,
$|(s_n)_n| = (|s_n|)_n$


etc., and
$(s_n)_n < (t_n)_n \equiv \exists k\,\exists n\,\forall m\,(t_{n+m} - s_{n+m} > 2^{-k})$.


Then for example, for $x, y \in R$,
$x < y \equiv \exists (s_n)_n \in x\ \exists (t_n)_n \in y\ ((s_n)_n < (t_n)_n)$,
$x > y \equiv y < x$,


$x \le y \equiv \neg(x > y)$,  $x \ge y \equiv y \le x$.


REMARK ON NOTATION. In most of the intuitionistic literature, one writes
$x \not> y$ for $x \le y$, to avoid the suggestion that $x \le y$ is equivalent to
$x < y \vee x = y$ (the latter assertion is much stronger). However, since our
$\le$ plays the same mathematical role as $\le$ in classical mathematics, and since
“$x < y \vee x = y$” has no practical importance as a relation between reals,
and moreover $\le$ is easier to read than $\not>$, we have preferred here $\le$
over $\not>$.


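5.5. Lemma (some properties of $+$, $\cdot$, $<$, $\le$).


$x \le y \wedge y \le x \to x = y$,
$x \le y \wedge y \le z \to x \le z$,
$(x \le y \wedge y < z) \vee (x < y \wedge y \le z) \to x < z$,
$x < y \leftrightarrow x + z < y + z$,  $z > 0 \wedge x < y \to x \cdot z < y \cdot z$,


$x \le y \leftrightarrow \forall k\,(x < y + 2^{-k})$,  $\neg\neg\, x \le y \leftrightarrow x \le y$.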


The proof is left to the reader; see e.g. HEYTING [1956], Ch. II. 


5.6. Apartness and inequality between reals 


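Classically, the relations $\neq$ and $\#$ defined by (assuming $(s_n)_n \in x$,
$(t_n)_n \in y$)
$x \neq y \equiv_{\mathrm{def}} \neg((s_n)_n \sim (t_n)_n)$,
$x \mathrel{\#} y \equiv_{\mathrm{def}} \exists k\,\exists m\,\forall n\,(|s_{n+m} - t_{n+m}| > 2^{-k})$
are equivalent. Relative to intuitionistic logic however, $x \neq y \to x \mathrel{\#} y$ is
equivalent to Markov’s schema (see 8.2). For mathematical practice, the
“positive” relation $\#$ is more important; the more so since $\neq$ can be
defined in terms of $\#$, not vice versa.

Quite generally, a relation $\#$ satisfying


S1  $\neg\, x \mathrel{\#} y \leftrightarrow x = y$,
S2  $x \mathrel{\#} y \to y \mathrel{\#} x$,
S3  $x \mathrel{\#} y \to x \mathrel{\#} z \vee z \mathrel{\#} y$,


is called an apartness relation. Our relation $\#$ not only satisfies S1-S3, but
in addition,


$x \mathrel{\#} y \to x + z \mathrel{\#} y + z$,  $x \mathrel{\#} 0 \wedge y \mathrel{\#} z \to xy \mathrel{\#} xz$,


$x \mathrel{\#} y \leftrightarrow x < y \vee y < x$.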


5.7. Partially defined operations


In inversion (i.e. the operation $x \mapsto x^{-1}$ for reals) we meet an example of a
mapping which is not always defined, at least not in the natural sense (and,
as will become clear in the sequel, cannot be defined constructively). With
classical logic we easily remove this defect by an arbitrary convention, e.g.
putting $0^{-1} = 0$, $(p/q)^{-1} = (q/p)$ for $p, q \neq 0$, and then defining


$((s_n)_n)^{-1} = (s_n^{-1})_n$  if $(s_n)_n \notin 0$,
$\qquad\qquad = (0)_n$  otherwise.


Then, classically, $(s_n)_n^{-1}$ is again an r.n.g. We cannot adopt this method in
constructive mathematics, since we cannot assert $\forall x \in R\,(x = 0 \vee x \mathrel{\#} 0)$.

Note that if we think of the reals in $R^* = \{x : x \in R \wedge x \mathrel{\#} 0\}$ as given to us
as r.n.g.’s $(s_n)_n \in x$ together with a proof that $(s_n)_n \mathrel{\#} 0$ (which implies that
we have a k such that $|x| > 2^{-k}$), we see that we can avoid introducing $x^{-1}$
as a partial operation by considering pairs (x, k) with the operation


$(x, k) \mapsto (\max(2^{-k}, x))^{-1}$.


Here we have a concrete example of replacing statements by more
“logic-free” statements (see 2.2.3).


5.8. Weak counterexamples and their recursive counterparts 


Traditional intuitionistic literature contains so-called “weak counterexamples”
to many mathematical assertions (such as: for all $x \in R$,
$x = 0 \vee x \neq 0$). These counterexamples are called “weak” because they do not
refute the assertions involved, but show that the assumption of the
existence of an intuitionistic proof for such assertions would lead to the
solution of an as yet unsolved mathematical problem.

For example, let Ax be a decidable predicate of natural numbers (i.e.
$\forall x\,(Ax \vee \neg Ax)$) such that $\neg\exists x\,Ax \vee \neg\neg\exists x\,Ax$ is unknown; a standard
example of such a predicate is: “x is the number of the first decimal of the
first sequence 0123456789 in the decimal fraction of π”. Instead of A itself
we can also use its characteristic function α:


Ax eax 0. 
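Now define a real $x_\alpha$ depending on α via an r.n.g. $(s_n)_n \in x_\alpha$ by


$\neg\exists y \le x\,(\alpha y \neq 0) \to s_x = 0$,
$\alpha y \neq 0 \wedge y \le x \wedge \forall z < y\,(\alpha z = 0) \to s_x = 2^{-y}$.   (1)

It is easily verified that $(s_n)_n$ is an r.n.g. and also that
$\neg\exists y\,(\alpha y \neq 0) \leftrightarrow x_\alpha = 0$,  $\exists y\,(\alpha y \neq 0) \leftrightarrow x_\alpha \mathrel{\#} 0$,


so $x_\alpha = 0 \vee x_\alpha \neq 0$ is equivalent to $\neg\exists y\,(\alpha y \neq 0) \vee \neg\neg\exists y\,(\alpha y \neq 0)$, and
$x_\alpha = 0 \vee x_\alpha \mathrel{\#} 0$ to $\neg\exists y\,(\alpha y \neq 0) \vee \exists y\,(\alpha y \neq 0)$. This shows we have no
grounds to assert $\forall x\,(x = 0 \vee x \neq 0)$, since this would require a solution to
$\neg\exists x\,Ax \vee \neg\neg\exists x\,Ax$; and if this particular problem were solved, we
could obtain a new weak counterexample starting from another unsolved
problem of the same type.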

There is a connection with recursively unsolvable problems: the
construction of the preceding counterexample depended on the existence of an
α such that $\exists y\,(\alpha y \neq 0) \vee \neg\exists y\,(\alpha y \neq 0)$ was unknown. If we consider an α
with an extra parameter z this becomes


$\exists y\,(\alpha(y, z) \neq 0) \vee \neg\exists y\,(\alpha(y, z) \neq 0)$.
With $(\alpha)_z = \lambda y.\alpha(y, z)$ it follows that $\forall x \in R\,(x = 0 \vee x \mathrel{\#} 0)$ implies


$\forall z\,(x_{(\alpha)_z} = 0 \vee x_{(\alpha)_z} \mathrel{\#} 0)$
hence
$\exists y\,(\alpha(y, z) \neq 0) \vee \neg\exists y\,(\alpha(y, z) \neq 0)$ for all z.   (2)


Now let $\{z : \exists y\,(\alpha(y, z) \neq 0)\}$ represent an r.e. set which is not recursive; then the
disjunction in (2) is not recursively decidable in z. As a result, we see that
for the recursive reals $R(\mathcal{R})$, $x = 0 \vee x \mathrel{\#} 0$ cannot be decided recursively in
a parameter (and inversion cannot be defined on $R(\mathcal{R})$ by a recursive
operation).

Another type of weak counterexample, of which we shall encounter
various examples below, depends on the existence of a problem of the
following type: $\forall x\,(Ax \vee \neg Ax)$, $\forall x\,(Bx \vee \neg Bx)$, $\neg(\exists x\,Ax \wedge \exists x\,Bx)$,
but $\neg\exists x\,Ax \vee \neg\exists x\,Bx$ is unknown. In terms of functions this becomes


$\neg(\exists y\,(\alpha y = 0) \wedge \exists y\,(\beta y = 0))$ holds, while
$\neg\exists y\,(\alpha y = 0) \vee \neg\exists y\,(\beta y = 0)$ is unknown   (3)


(an instance of such a problem is e.g. Ax ≡ “2x is the number of the first
decimal of the first sequence 0123456789 in the decimal expansion of π”
and Bx ≡ “2x + 1 is the number of the first decimal of the first sequence
0123456789 in the decimal expansion of π”).

(3) can be used to give an example of a real (say $x_0$) such that
$x_0 \le 0 \vee x_0 \ge 0$ (hence a fortiori $x_0 < 0 \vee x_0 = 0 \vee x_0 > 0$) is unknown; define
$x_0$ via a real number generator $(s_n)_n \in x_0$ such that

$\neg(\exists y \le n)(\alpha y = 0 \vee \beta y = 0) \to s_n = 0$,
$\alpha m = 0 \wedge \forall y < m\,(\alpha y \neq 0) \wedge m \le n \to s_n = 2^{-m}$,   (4)
$\beta m = 0 \wedge \forall y < m\,(\beta y \neq 0) \wedge m \le n \to s_n = -2^{-m}$.

It is easy to verify that $(s_n)_n$ is an r.n.g., and that for $(s_n)_n \in x_0$:
$x_0 \le 0 \leftrightarrow \neg\exists y\,(\alpha y = 0)$,  $x_0 \ge 0 \leftrightarrow \neg\exists y\,(\beta y = 0)$.
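
The generator in (4) is easy to write down explicitly. The sketch below is an illustration only: `alpha` and `beta` are arbitrary Python functions standing in for the number-theoretic functions of (3), the assumption that they do not both take the value 0 somewhere is not checked, and the function name is hypothetical.

```python
from fractions import Fraction


def x0_generator(n, alpha, beta):
    """n-th term s_n of the real number generator defined in (4).

    Assumes, as in (3), that alpha and beta do not both have a zero.
    """
    for m in range(n + 1):
        if alpha(m) == 0:
            # first zero of alpha found at m <= n
            return Fraction(1, 2 ** m)
        if beta(m) == 0:
            # first zero of beta found at m <= n
            return Fraction(-1, 2 ** m)
    # no zero of alpha or beta up to n
    return Fraction(0)
```

Each term is computed by a bounded search, so the sequence is perfectly well defined; what remains unknown is only whether it eventually leaves 0, and in which direction. The identity serves as a modulus, since once a term is non-zero all later terms agree with it, and a zero first found beyond n changes the value by at most $2^{-(n+1)}$.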
A recursive version of this counterexample depends on the existence of
two disjoint r.e. sets which are not recursively separable: if
$\{z : \exists y\,(\alpha(y, z) = 0)\}$, $\{z : \exists y\,(\beta(y, z) = 0)\}$ are such sets, then


$\forall z\,\neg\{\exists y\,(\alpha(y, z) = 0) \wedge \exists y\,(\beta(y, z) = 0)\}$   (5.a)
while
$\neg\exists y\,(\alpha(y, z) = 0) \vee \neg\exists y\,(\beta(y, z) = 0)$   (5.b)


is not recursively decidable in z; or, what amounts to the same, there is no
recursive γ such that


$\forall z\,\{(\gamma z = 0 \to \neg\exists y\,(\alpha(y, z) = 0)) \wedge (\gamma z \neq 0 \to \neg\exists y\,(\beta(y, z) = 0))\}$.   (5.c)


For recursive reals this implies that $x \le 0 \vee x \ge 0$ cannot be decided
recursively in numerical parameters.
Another instructive example is discussed in the next subsection.


5.9. The representation of reals by binary expansions 


Let $R_b$ denote the class of reals permitting a binary expansion

$\pm m + \sum_{n=0}^{\infty} (\alpha n)\,2^{-n-1}$,  $\alpha n \le 1$ for all n.


Obviously $R_b \subset R$, since $(\pm m + \sum_{k=0}^{n} (\alpha k)\,2^{-k-1})_n$ is an r.n.g. Using classical
logic, we can also show $R \subset R_b$ (i.e. for each r.n.g. we can find an
equivalent binary expansion). To see this, argue as follows: an element x of
R is either representable as


$\pm m + \sum_{k=0}^{n} (\alpha k)\,2^{-k-1}$,  $\alpha k \le 1$ for $0 \le k \le n$,

or not (i.e. x is either a “dyadic rational” or not). In the first case, we are
done. In the second case, assume for simplicity $x \in [0, 1]$, and let $(s_n)_n$ be
an r.n.g. for x such that $|x - s_n| < 2^{-n}$ for all n, and assume


$t_n = \sum_{k=0}^{n} (\alpha k)\,2^{-k-1}$


to be constructed such that

$t_n < x < t_n + 2^{-n-1}$


($t_n = x$ is excluded). Then since $x \neq t_n + 2^{-n-2}$, $(x < t_n + 2^{-n-2}) \vee
(t_n + 2^{-n-2} < x)$; in the first case we put α(n + 1) = 0, in the second case
α(n + 1) = 1, and define $t_{n+1}$ accordingly.

On the other hand, if we consider $\tfrac{1}{2} + x_0$ ($x_0$ defined via an r.n.g. $(s_n)_n$ as
in (4) of the preceding section) and we assume $\tfrac{1}{2} + x_0$ to have a binary
expansion $\sum_{n=0}^{\infty} (\alpha n)\,2^{-n-1}$, then $\alpha 0 = 0 \to \tfrac{1}{2} + x_0 \le \tfrac{1}{2}$, $\alpha 0 = 1 \to \tfrac{1}{2} + x_0 \ge \tfrac{1}{2}$,
which we cannot decide; and this weak counterexample is easily transformed
into a proof of the fact that we cannot find the binary expansions
for recursive elements of R by operations recursive in numerical parameters.

Similarly, we can show by classical reasoning that $R_b$ is closed under
$+$, $\cdot$, but on the recursive elements of $R_b$ this cannot be done by a recursive
mapping (cf. MAYOH [1965]).


5.10-5.12. Completeness of R


5.10. DEFINITION. (i) A sequence of reals is given by a double sequence of
r.n.g.’s $((s_{m,n})_n)_m$, and a function α such that, for each m, $\lambda n.\alpha(m, n)$ is a
modulus for $(s_{m,n})_n$.

A sequence $(x_m)_m$ of reals is said to be a Cauchy-sequence if there is a
modulus β such that


$\forall k\,\forall m\,(|x_{\beta k} - x_{\beta k+m}| < 2^{-k})$.


(ii) A sequence $(x_n)_n$ has limit x (notation $\lim_n x_n$ or $\lim (x_n)_n = x$), if
there is an α (limit-modulus) such that


$\forall k\,\forall m\,(|x - x_{\alpha k+m}| < 2^{-k})$.


5.11. THEOREM. $R(\mathcal{U})$ is complete, i.e. every Cauchy-sequence $(x_n)_n$ has a
limit.

PROOF. Let $(x_n)_n$ be a Cauchy-sequence, $x_n = (s_{n,m})_m$, for each n,
$\lambda m.\alpha(n, m)$ is a modulus for $(s_{n,m})_m$; β is a modulus for $(x_n)_n$. Therefore,
$\forall k\,\forall n\,\forall m\,(|s_{n,\alpha(n,k)} - s_{n,\alpha(n,k)+m}| < 2^{-k})$,
$\forall k\,\forall m\,(|x_{\beta k} - x_{\beta k+m}| < 2^{-k})$


and therefore, as is easily verified, $(s_{\beta n,\,\alpha(\beta n,\,n+1)})_n$ is an r.n.g.; let x be the
corresponding real number. It is easy to see $(x_n)_n$ converges to x with
modulus function $\lambda n.\beta(n + 1)$. □




5.12. REMARK. Note that the definitions of r.n.g., ~, sequence of reals,
Cauchy-sequence, sequence of reals converging to a limit become equivalent
to their more usual definitions such as


$\forall k\,\exists n\,\forall m\,(|s_n - s_{n+m}| < 2^{-k})$,

$\forall k\,\exists n\,\forall m\,(|s_{n+m} - t_{n+m}| < 2^{-k})$,

$\forall k\,\exists n\,\forall m\,(|x - x_{n+m}| < 2^{-k})$,
for $(s_n)_n$ r.n.g., $(s_n)_n \sim (t_n)_n$, $\lim_n x_n = x$ respectively, on assumption of the
axiom of choice
AC-NF  $\forall x\,\exists\alpha\,A(x, \alpha) \to \exists\beta\,\forall x\,A(x, (\beta)_x)$.


Since AC-NF is implied by ECT$_0$, we can in constructive recursive
analysis (“the Russian school”) safely assume the usual definitions, just as
in intuitionistic analysis developed in one of the usual theories of choice
sequences.

For classical recursive analysis ((partial) functions from N into N are 
everywhere assumed to be (partial) recursive, but the logic is classical) this 
is not possible, since it is possible to define r.n.g.’s (fan)n With @ not 
recursive by means of classical logic. 


5.13. Real-valued functions 


A real-valued function is a mapping $R \to R$. A real-valued function f is
said to be continuous if there is an operation $\Phi : R \times N \to N$ such that


$\forall k\,\forall x \in R\,\forall y \in R\,(|x - y| < 2^{-\Phi(x,k)} \to |fx - fy| < 2^{-k})$.


A real-valued function is said to be uniformly continuous if there is a
function $\alpha : N \to N$ such that


$\forall k\,\forall x \in R\,\forall y \in R\,(|x - y| < 2^{-\alpha k} \to |fx - fy| < 2^{-k})$.


Note that on assumption of AC$_0$ (or AC-NN), this is equivalent to the
more customary formulation


$\forall k\,\exists n\,\forall x \in R\,\forall y \in R\,(|x - y| < 2^{-n} \to |fx - fy| < 2^{-k})$.
We define for $x, y \in R$, $x \le y$:
$(x, y) = \{z : x < z < y\}$,  $[x, y] = \{z : x \le z \le y\}$.


Functions continuous or uniformly continuous on (x, y) or [x, y] are
defined similarly.


5.14. The existence of an l.u.b. and the intermediate-value theorem


We present two counterexamples to well-known theorems from classical 
analysis, showing at the same time how they have parallels in classical 
problems regarding solutions which are continuous in parameters. 


5.14.1. EXAMPLE. Every uniformly continuous real-valued function f on
[0, 1] has a least upper bound = $\sup\{fx : x \in [0, 1]\}$, but we cannot expect to
find effectively an $x_1 \in [0, 1]$ such that $f(x_1) = \sup\{f(y) : y \in [0, 1]\}$. The
positive statement about the existence of the l.u.b. is easily proved by
considering

$x_n = \sup\{f(k \cdot 2^{-n}) : 0 \le k \le 2^n\}$


and proving $(x_n)_n$ to be a Cauchy-sequence of reals, which must have a
limit in virtue of the completeness of $R(\mathcal{U})$.

A weak counterexample is given by defining as in 5.8(4) an $x_0$ such that
$x_0 > 0$, $x_0 = 0$, or $x_0 < 0$ is unknown.

Let f(x) = xox +1. Assume now we could compute an x, s.t. f(x:)= 
sup{f(y): y € [0, 1]}; then x; <iv x, >3. 

If x» >0, x1=1, so 4x,<3; if xo<0, x,=0, hence -x,>%. But 
X,<3V XxX, >4 requires 7 x9 >0 v 1x9 <0, which we do not know. Trans- 
formation into an example showing the non-existence of a solution 
recursive in parameters is now routine (cf. 5.8 and 5.9). 

In terms of classical mathematics we can express the content of the 
counterexample as follows. Small changes in f (in the sense of the 
uniform-continuity topology) result in big changes in the required $x_1$; so we
can state classically: for uniformly continuous f on [0, 1], we cannot find an
$x_1$ uniformly in f such that $f(x_1) = \sup\{f(y) : y \in [0, 1]\}$.


5.14.2. EXAMPLE (see Fig. 1). Let f be uniformly continuous on [0, 1],
$f(0) < 0$, $f(1) > 0$. By a weak counterexample we can show that we cannot
find an x such that f(x) = 0. For our counterexample we take the uniformly
continuous f satisfying ($x_0$ as before)


$x \in [0,\ 3^{-1} + 3^{-1}x_0] \to f(x) = 3x - 1$,
$x \in [3^{-1} + 3^{-1}x_0,\ 2 \cdot 3^{-1} + 3^{-1}x_0] \to f(x) = x_0$,
$x \in [2 \cdot 3^{-1} + 3^{-1}x_0,\ 1] \to f(x) = 3x - 2$.


Clearly, if $x_0 < 0$, $f(x) = 0 \to x = 2 \cdot 3^{-1}$ and if $x_0 > 0$, $f(x) = 0 \to x = 3^{-1}$.
Now if f(x:) = 0, x, <37'v x, >2-37', hence (xo <0) v (x0 > 0) which 
we do not know. 
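
The jump of the zero can be observed numerically. In the sketch below $x_0$ is replaced by an explicitly given small rational, an assumption made purely for illustration since the real $x_0$ of 5.8(4) has no known sign; the function names and the grid size are hypothetical choices.

```python
from fractions import Fraction


def f(x, x0):
    """The piecewise linear function of the example, for rational x and x0."""
    x0 = Fraction(x0)
    a = Fraction(1, 3) + x0 / 3   # end of the first piece
    b = Fraction(2, 3) + x0 / 3   # end of the plateau
    if x <= a:
        return 3 * x - 1
    if x <= b:
        return x0
    return 3 * x - 2


def zeros_on_grid(x0, n=300):
    """All grid points k/n in [0, 1] at which f vanishes exactly."""
    return [Fraction(k, n) for k in range(n + 1) if f(Fraction(k, n), x0) == 0]
```

With $x_0 = 1/100$ the only zero found is 1/3, with $x_0 = -1/100$ it is 2/3, and with $x_0 = 0$ every grid point of [1/3, 2/3] is a zero: an arbitrarily small change in the data moves the solution by 1/3.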


Fig. 1.


Classically, the result has its counterpart in the statement that the
solution of $f(x_1) = 0$ cannot be found continuously in f. Note on the one
hand that by classical logic, for a recursive f there is always a recursive x
such that f(x) = 0 (by a reasoning quite similar to the argument showing
that $R \subset R_b$ (5.9)). On the other hand, the standard transformation to a
recursive version of this counterexample shows that there is no solution
recursive in f, hence a fortiori not a recursively continuous one.


REMARK. Similarly, we may construct a counterexample to Brouwer’s fixed
point theorem in one dimension: let f be a uniformly continuous function
on [0, 1] determined by: $x \in [2 \cdot 3^{-1} - x_0,\ 1] \to fx = 2 \cdot 3^{-1}$,
$x \in [3^{-1} - x_0,\ 2 \cdot 3^{-1} - x_0] \to fx = x + x_0$, $x \in [0,\ 3^{-1} - x_0] \to fx = 3^{-1}$. f maps
[0, 1] onto $[3^{-1}, 2 \cdot 3^{-1}]$, but we cannot find an x such that fx = x. This
example can be made into a two-dimensional counterexample in a trivial
way: $g : (x, y) \mapsto (fx,\ 2^{-1}y + 2^{-2})$, defined on $[0, 1] \times [0, 1] = I^2$.

Classically, the recursive version of this counterexample to Brouwer’s
theorem only shows that the fixed point cannot be found recursively in f,
but in fact in the two-dimensional case much stronger counterexamples can
be given (OREVKOV [1963, 1964]) showing that (a) there exists a continuous
mapping φ of the square $I^2$ into $I^2$ such that $\rho(p, \varphi p) \ge 1/8$ for all $p \in I^2$, ρ
a Euclidean metric, and (b) there exists a uniformly continuous mapping ψ
from $I^2$ into $I^2$ such that $\forall z \in I^2\,(\psi z \mathrel{\#} z)$.


5.15. Metric spaces 


5.15. DEFINITION. (i) A metric space is a pair (V, ρ), where V is a set (the
set of points of the space), and ρ a mapping $V \times V \to R(\mathcal{U})$, such that
(1) $\rho(x, y) \ge 0$,  $\rho(x, y) = 0 \leftrightarrow x = y$,
(2) $\rho(x, y) = \rho(y, x)$,
(3) $\rho(x, y) + \rho(y, z) \ge \rho(x, z)$.

(ii) A complete separable metric space is constructed as follows. Let a
countable sequence of objects $(p_n)_n$ with metric ρ be given, say by a β such
that


$|\rho(p_i, p_j) - r_{\beta(i,j,k)}| < 2^{-k}$.
We now define point-generators (p.g.) similar to r.n.g.’s, ρ replacing
$\lambda xy.|x - y|$ in the definition; equivalence ($(p_{\alpha n})_n \sim (p_{\beta n})_n$) is also defined as
for r.n.g.’s. Points are now equivalence classes of p.g.’s; and ρ is
automatically extended to all points by defining, if $(p_{\alpha n})_n \in x$, $(p_{\beta n})_n \in y$,
$(p_{\alpha n})_n$, $(p_{\beta n})_n$ p.g.’s,


$\rho(x, y) \equiv_{\mathrm{def}} \lim_n \rho(p_{\alpha n}, p_{\beta n})$.


Each $p_n$ is embedded in the space as the equivalence class of $(p_n)_m$, and ρ is
simply an extension of the ρ on $(p_n)_n$ modulo this embedding. An arbitrary
separable metric space is a subspace of a complete separable metric space.


5.16. Standard representation of a complete separable metric space 

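Let $\lambda i.\alpha(k, n, i)$ enumerate the set

$A_{n,k} \equiv_{\mathrm{def}} \{i : \rho(p_i, p_n) < 2^{-k}\}$.
$A_{n,k}$ is easily seen to be enumerable, since
$A_{n,k} = \{i : \exists l\,(r_{\beta(i,n,l)} < 2^{-k} - 2^{-l})\}$.

(If $\rho(p_i, p_n) < 2^{-k}$, also $\rho(p_i, p_n) < 2^{-k} - 2^{-l+1}$ for some l, hence $r_{\beta(i,n,l)} <
\rho(p_i, p_n) + 2^{-l} < 2^{-k} - 2^{-l}$; and if $r_{\beta(i,n,l)} < 2^{-k} - 2^{-l}$ then also $\rho(p_i, p_n) < 2^{-k}$.)

Now we can associate to any $\gamma \in \mathcal{U}$ a sequence $(q_n)_n$ of points from $(p_n)_n$
such that

(1) $q_n = p_{\delta n}$, where $\delta 0 = \gamma 0$, and $\delta(n + 1) = \alpha(n, \delta n, \gamma n)$.
It is easy to see that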

(2) lim, gn = x > Wm (p(qn, X)<2°"""), and each (q,), defined as in (1) 
has a limit. 
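Let us denote the $x = \lim_n q_n$ for $(q_n)_n$ obtained from γ as in (1) by $x_\gamma$.

(3) $\forall x\,\exists\gamma\,(x = x_\gamma)$.

(4) Let $V_{\bar\gamma n} = \{x_\delta : \bar\delta n = \bar\gamma n\}$, then


$\forall x\,\exists\gamma\,(x = x_\gamma \wedge \forall k\,(U(2^{-k-2}, x) \subset V_{\bar\gamma k}))$,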


where U(e,x) = {y: p(x, y)< «}. To see this, take x, y such that p(x, y)< 
2*? and let (qn)n (qidn C(Pndn Wn (p(GnX)<2" 7a p(quy)<2"”), 
then there are y, 5 as in (1) such that q, = Psa for all n, x = x,; and since 
PUGm Jn) <2 1, P(4u-1 94) = P(Gu-1, X) + p(x, y) + p(y, qi) <2" t+ 
2*?+2*?=2* there are also y’, 6’ with 80=y0, 6(n+1)= 
a(n, 5'n, y'N), (Pon)n = Gos +5 Qk-1s Vhs Qk+s,-.. and So y = x,-; since 7'k = 
7k, U(2*?, x) C Vin 


5.17. Examples of complete separable metric spaces 


(a) R, R’, [0, 1). 
(b) If α, β are given we put


$\mu_n(\alpha, \beta) = \sup\{2^{-i} : (i < n \wedge \alpha(i) \neq \beta(i)) \vee i = n\}$,


$\rho(\alpha, \beta) = \lim_n \mu_n(\alpha, \beta)$. ρ is a metric on $N^N$; this is the constructive
equivalent of classical Baire space.

(c) This metric for Baire space obviously carries over to every subset of 
Baire space, hence in particular to the Cantor-set of 0-1-sequences. 

(d) Hilbert space of countably infinite dimension. 
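
The approximations $\mu_n$ of the metric in example (b) are computable from the first n values of the two sequences. A minimal sketch, with sequences modelled as Python functions on the natural numbers and a hypothetical function name:

```python
from fractions import Fraction


def mu(n, alpha, beta):
    """mu_n(alpha, beta) = sup{2^-i : (i < n and alpha(i) != beta(i)) or i = n}."""
    for i in range(n):
        if alpha(i) != beta(i):
            # the least index of disagreement gives the largest value 2^-i
            return Fraction(1, 2 ** i)
    # no disagreement below n: only the term for i = n remains
    return Fraction(1, 2 ** n)
```

Once a disagreement has been found at index i, the value stays at $2^{-i}$ for all larger n; as long as none has been found, the value $2^{-n}$ only bounds the distance from above. This is why the metric is given as a limit of the $\mu_n$ rather than by a case distinction on whether the two sequences are equal.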


6. Continuity; choice sequences 


6.1. Continuity is more familiar than recursiveness, and, in the case of
analysis and topology, also closer to the interests of mathematical practice.
It is not surprising therefore that Brouwer’s paradox “all real-valued
functions are continuous”, and the underlying continuity postulates for
choice sequences, received much attention, just as the continuity
theorems of RA and CRA are among the most striking results in these
areas.

In this section we shall briefly explain something of the original
motivation behind the introduction of choice sequences, and show how this
led to a theory of “continuous quantification” (choice sequences as a
“figure of speech”, see 6.3 and 6.14-6.16) with interesting metamathematical
applications.


6.2. Heuristic considerations leading to the study of continuity schemata 


First of all we observe that the usual classical examples of discontinuous
functions defined on R are no counterexamples in intuitionistic mathematics.
For example, consider the function f defined on R by f(x) = 0 for
x < 0, f(x) = 1 for x > 0. From a constructive point of view this function is
not everywhere defined; if $x_0$ is a real as defined in 5.8(4), i.e. $x_0 < 0 \vee
x_0 = 0 \vee x_0 > 0$ is unknown, then we do not know how to compute $f(x_0)$ to
any required degree of accuracy; that is to say, we are not able to prove
that $f(x_0)$ is defined.

This example suggests that a real-valued function f, in order to be
defined everywhere, should be such that approximations to the argument
are sufficient to compute an approximation to the value with prescribed
degree of accuracy; in short, f should be continuous. In other words, we
expect that whenever $\forall x \in R\,\exists! y \in R\,A(x, y)$ is provable, there must be a
continuous (in the standard topology of R) f such that $\forall x \in R\,A(x, fx)$.

As we shall see below in 6.4, $\forall\alpha\,\exists x$-continuity (α ranging over number
theoretic functions, x over natural numbers) is enough to ensure this.

It may be tempting to think of the following assumption: $\forall x \in R\,\exists y \in
R\,A(x, y) \to \exists f\,\forall x \in R\,A(x, fx)$ (f continuous in the standard topology of
R) as being equally plausible; but in fact in this case there is a
counterexample.


EXAMPLE. Constructively as well as classically, $x^3 - 3x + a = 0$ has a root in
R for all $a \in R$. However, it is easily seen, there is no $f : R \to R$, continuous
in the standard topology for R, such that $\forall a \in R\,(f^3(a) - 3f(a) + a = 0)$.
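
The obstruction to a continuous choice of root can be seen in a floating-point experiment. This is a classical illustration only, not a constructive argument: the bisection helper and the bracketing intervals around the critical points $x = \pm 1$ are choices made for this sketch, and the function names are hypothetical.

```python
def p(x, a):
    """The cubic x^3 - 3x + a."""
    return x ** 3 - 3 * x + a


def bisect(lo, hi, a, steps=80):
    """Approximate a root of p(., a) in [lo, hi], assuming a sign change there."""
    flo = p(lo, a)
    for _ in range(steps):
        mid = (lo + hi) / 2
        fmid = p(mid, a)
        if (fmid <= 0) == (flo <= 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return (lo + hi) / 2


def real_roots(a):
    """Roots found on the three monotone branches separated by x = -1 and x = 1."""
    bound = 3 + abs(a)
    cuts = [-bound, -1.0, 1.0, bound]
    return [
        bisect(lo, hi, a)
        for lo, hi in zip(cuts, cuts[1:])
        if p(lo, a) * p(hi, a) <= 0
    ]
```

For a = 1.99 the largest root lies slightly above 1, whereas for a = 2.01 the only root lies below -2: the branch of roots near 1 disappears as a passes 2, so a selection following that branch must jump, and the symmetric situation at a = -2 rules out the other branches.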

On the other hand, an approximation to the root can be computed from
an approximation to a, i.e. from an initial segment of a real number
generator for a. The natural topology for r.n.g.’s is that of zero-dimensional
Baire space (i.e. a neighbourhood consists of all r.n.g.’s which
have a certain initial segment in common), and therefore we can rephrase
the preceding assertion as: there is a continuous mapping (relative to
Baire-space topology) assigning an r.n.g. for a root of $x^3 - 3x + a$ to each
r.n.g. for a. Correspondingly, we shall see (in 6.13-6.16) that we can adopt
a $\forall\alpha\,\exists\beta$-continuity schema.

Note that the preceding example also illustrates “intensional aspects”: a
method Φ determining a real y for each real x may use all available
information about x; and since a real is given to us as an r.n.g., Φ acts in
fact on r.n.g.’s, and it is not necessary that $(r_n)_n \sim (r'_n)_n \to \Phi((r_n)_n) \sim
\Phi((r'_n)_n)$.


Choice sequences. Brouwer’s concept of choice sequence provides a sort of
theoretical reason for the continuity schemata, which may be described as
follows.

A proof of $\forall\alpha\,\exists x\,A(\alpha, x)$ implicitly contains an operation Φ such that
$\forall\alpha\,A(\alpha, \Phi\alpha)$, but in the determination of Φα all available non-extensional
information may be used (e.g. Gödel numbers if α ranges over total
recursive functions). However, we may also consider “unfinished” sequences,
i.e. processes for determining values to each argument which are
not a priori determined by a law (extreme example: the sequence of the
casts of a die); expressed otherwise, we may enlarge (generalize) our
concept of sequence by abstracting from the idea that a sequence should
always be determined by a law, and we retain only the essential feature: to
each argument n eventually a value αn will be determined. For this
generalized concept, $\forall\alpha\,\exists x\,A(\alpha, x)$ becomes a much stronger assertion,
since the range of α has been extended; and it is plausible that an
operation Φ which should produce an x to each α (hence also in the
extreme case where at any stage in the construction of α only an initial
segment of α is known) is necessarily continuous, i.e. Φ should satisfy


$\forall\alpha\,\exists x\,\forall\beta \in \bar\alpha x\,(\Phi\alpha = \Phi\beta)$
which results in a continuity schema 
WC-N* WadxA(a,x, y) > Va Ay 3x VB E ay A(B, x, y). 


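Assuming further that for a continuous operation $\Phi$ we can decide
whether an initial segment $\bar\alpha x$ of the argument $\alpha$ is sufficient to compute
$\Phi\alpha$ or not (which means that $\Phi$ is given to us by a neighbourhood function
$\beta$ such that $\beta(\bar\alpha x) \neq 0 \leftrightarrow \Phi\alpha$ can be computed from $\bar\alpha x$ and has value
$\beta(\bar\alpha x) - 1$), we can strengthen WC-N* to C-N* or CONT$_0$,

    CONT$_0$   $\forall\alpha\,\exists x\, A(\alpha, x, \gamma) \to \exists\beta \in K_0\,\forall\alpha\, A(\alpha, \beta(\alpha), \gamma)$,

where

    $K_0\beta \equiv_{\mathrm{def}} \forall\alpha\,\exists x\,(\beta(\bar\alpha x) \neq 0) \wedge \forall n\,m\,(\beta n \neq 0 \to \beta n = \beta(n * m))$,   (1)
    $\beta(\alpha) = x \equiv_{\mathrm{def}} \exists y\,(\beta(\bar\alpha y) = x + 1)$,   (2)
    $A(\alpha, \beta(\alpha), \gamma) \equiv_{\mathrm{def}} \exists x\,(\beta(\alpha) = x \wedge A(\alpha, x, \gamma))$.   (3)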
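
Definitions (1) and (2) can be read as an evaluation procedure. The Python sketch below is only an illustration under stated assumptions: finite sequences are modelled as tuples, infinite sequences as functions on the natural numbers, and the names `apply_neighbourhood` and `beta_alpha_of_alpha0` are hypothetical. The search terminates only if the argument really is a neighbourhood function in the sense of (1).

```python
from itertools import count


def apply_neighbourhood(beta, alpha):
    """Evaluate beta(alpha) as in (2): find the first initial segment of alpha
    on which beta is nonzero and return that value minus one."""
    segment = ()
    for y in count():
        value = beta(segment)        # beta applied to the initial segment of length y
        if value != 0:
            return value - 1
        segment += (alpha(y),)       # extend the segment by one more value of alpha


def beta_alpha_of_alpha0(n):
    """Hypothetical neighbourhood function coding the functional alpha -> alpha(alpha(0))."""
    if len(n) == 0 or len(n) <= n[0]:
        return 0                     # segment too short: value not yet determined
    return n[n[0]] + 1               # value plus one, unchanged under extension of n
```

For instance, with `alpha = lambda k: k + 2` the evaluation inspects the segments of length 0 to 3 and returns `alpha(alpha(0)) = alpha(2) = 4`.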


Since most of the mathematical developments in elementary analysis do
not depend on the assumption that the elements of $\mathcal{U}$ are given by a law,
they should be valid for choice sequences (= the enlarged concept of a
sequence as described above) as well; and for choice sequences it is
plausible to assume in addition the continuity schema CONT$_0$.


6.3. Elimination of choice sequences 


However, our arguments only made CONT$_0$ plausible; they did not justify
it in a rigorous way. In fact, at present no simple concept of choice
sequence is known for which both CONT$_0$ and the closure conditions on
universes $\mathcal{U}$ given in 5.2 can be derived.

Therefore we shall follow another approach in 6.13-6.16 below: we shall
justify quantification over choice sequences by reinterpretation. There are
essentially two methods for doing this; the first one is Kleene's realizability
by functions, which is briefly discussed in 6.23; the other method consists in
elimination of choice sequences by a method of contextual definition. More
precisely, we consider formal systems containing, besides number quantifiers,
two sorts of functional quantification: ordinary function quantifiers
("constructive function quantifiers") and choice quantifiers. The meaning
of the choice quantifiers is then explained by a contextual definition in
terms of the other logical operators: they are explained as a "figure of
speech". The definition automatically validates CONT$_0$; most of the work
consists in showing that the choice quantifiers can indeed be assumed to
satisfy the ordinary quantifier laws (a fact which would have been obvious
if we could have regarded quantification over choice sequences as
quantification over a special sort of objects).

Before we describe the elimination and its applications in 6.13-6.16, we
shall first present (1) a mathematical application of WC-N* (for more
illustrations of the use of WC-N* in mathematics see TROELSTRA [1977a],
Ch. 6), (2) a brief discussion of the relation between Church's thesis and
continuity, and (3) some logical relationships relative to EL*.


6.4. THEOREM. Let $\Gamma = (V, \rho)$ be a complete, separable, metric space
constructed over the basis points $\langle p_n\rangle_n$, let $\{W_i : i \in I\}$, $I \subset \mathbb{N}$, be a covering of $\Gamma$,
i.e. $\forall x \in V\,\exists i \in I\,(x \in W_i)$. Then $\{\mathrm{Int}(W_i) : i \in I\}$ is again a covering of $\Gamma$
($\mathrm{Int}(W)$ = interior of $W$), provided the universe $\mathcal{U}$ of number-theoretic
functions (see 5.2) satisfies WC-N*.


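PROOF. Using the standard representation (5.16) for complete, separable,
metric spaces, we have

    $\forall\alpha\,\exists i \in I\,(x_\alpha \in W_i)$.

By WC-N*,

    $\forall\alpha\,\exists i\,\exists k\,\forall\beta \in \bar\alpha k\,(x_\beta \in W_i \wedge i \in I)$.

With the definition of $V_n$ (see 5.10),

    $\forall\alpha\,\exists i\,\exists k\,(i \in I \wedge V_{\bar\alpha k} \subset W_i)$,

and thus with property (4) in 5.16,

    $\forall x\,\exists i \in I\,(x \in \mathrm{Int}(W_i))$.  □
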
COROLLARY. Let $\Gamma = (V, \rho)$, $\Gamma' = (V', \rho')$ be complete, separable, metric
spaces with basis points $\langle p_n\rangle_n$, $\langle p'_n\rangle_n$ respectively. Any mapping $f$ from $\Gamma$ into
$\Gamma'$ is continuous.

PROOF. Let $U_i = \{x : \rho'(p'_i, fx) < 2^{-\nu}\}$, $\nu \in \mathbb{N}$ fixed; $\{U_i : i \in \mathbb{N}\}$ covers $\Gamma$.
By the preceding theorem, $\{\mathrm{Int}(U_i) : i \in \mathbb{N}\}$ covers $\Gamma$, therefore

    $\forall x \in V\,\exists k\,\exists i\,(U(x, 2^{-k}) \subset U_i)$,

and thus for any $x$ we can find $k$, $i$ such that

    $\rho(x, y) < 2^{-k} \to fy \in U_i$,

hence $\rho'(fx, fy) \leq \rho'(fx, p'_i) + \rho'(fy, p'_i) < 2 \cdot 2^{-\nu}$.  □


6.5. Church’s thesis and continuity 


For $\mathcal{U}$ the set of total recursive functions, WC-N* is manifestly
false, since CT $\equiv \forall\alpha\,\exists x\,\{\forall y\,\exists z\,(Txyz \wedge \alpha y = Uz)\}$ holds but $x$ cannot be
determined from an initial segment of $\alpha$ only: CT very forcefully draws
attention to intensional aspects of sequences (namely their Gödel
number). On the other hand, CT and

    CONT$_0$!   $\forall\alpha\,\exists!x\, A(\alpha, x) \to \exists\gamma \in K_0\,\forall\alpha\, A(\alpha, \gamma(\alpha))$

($K_0$ defined as in 6.2(1)) are compatible; in fact HA + M + ECT$_0$ $\vdash$ CONT$_0$!.
More is true: in CRA (i.e. HA + M + ECT$_0$, cf. 4.13) the corollary of 6.4 is
provable (see CEITIN [1959], MOSCHOVAKIS [1964]). However, in this case
the corollary holds, so to speak, for quite different reasons: the condition
that an operation, which is defined for all total recursive functions, is
extensional (depends on the course of values of the functions only) is quite
strong.


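6.6-6.12. Some logical relationships relative to EL*, bar induction

6.6. $K_0$, the set of neighbourhood functions, has already been defined in
6.2(1). We put

    $K_1\alpha \equiv_{\mathrm{def}} \forall\gamma\,\exists z\,\forall\beta \leq \gamma\,(\alpha(\bar\beta z) \neq 0) \wedge \forall n\,m\,(\alpha n \neq 0 \to \alpha n = \alpha(n * m))$.   (1)

$K_1\alpha$ expresses that $\alpha$ is a neighbourhood function, and that the functional
$\Phi_\alpha$,

    $\Phi_\alpha\beta = x \leftrightarrow \exists z\,(\alpha(\bar\beta z) = x + 1)$,

represented by $\alpha$ is uniformly continuous on compact subsets of $\mathcal{U}$ (= the
range of our function quantifiers).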
We introduce $K_2$ by a generalized inductive definition

    K1-2   $A(\alpha, K_2) \to K_2\alpha$,
    K3     $\forall\alpha\,[A(\alpha, Q) \to Q\alpha] \to [K_2\alpha \to Q\alpha]$,

where

    $A(\alpha, P) \equiv_{\mathrm{def}} \exists x\,(\alpha = \lambda n.Sx) \vee (\alpha 0 = 0 \wedge \forall x\,(\lambda n.\alpha(\hat{x} * n) \in P))$.   (2)

K1-2 is in fact equivalent to the conjunction of

    K1   $\alpha = \lambda n.Sx \to K_2\alpha$,
    K2   $\alpha 0 = 0 \wedge \forall x\,(\lambda n.\alpha(\hat{x} * n) \in K_2) \to K_2\alpha$.

K1 and K2 express that $K_2$ satisfies certain closure conditions, K3 that $K_2$
is the minimal set satisfying those closure conditions.
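
The two closure conditions can be mirrored by two constructors. The following Python sketch is an informal illustration, not a formalization of the generalized inductive definition: finite sequences are tuples, and the names `leaf`, `sup` and `sum_of_first_two` are hypothetical.

```python
def leaf(x):
    """Closure condition K1: the constant function lambda n. x + 1."""
    return lambda n: x + 1


def sup(branch):
    """Closure condition K2: alpha with alpha(<>) = 0 whose subfunction
    lambda n. alpha(<x> * n) is branch(x), for every natural number x."""
    def alpha(n):
        if len(n) == 0:
            return 0                 # undecided on the empty sequence
        return branch(n[0])(n[1:])   # descend into the x-th subfunction
    return alpha


# A function built by two applications of K2 over K1; it codes the
# functional alpha -> alpha(0) + alpha(1).
sum_of_first_two = sup(lambda x: sup(lambda y: leaf(x + y)))
```

Every function built this way is zero until enough of the argument is known and then keeps its value under extension, which is the monotonicity clause in the definition of a neighbourhood function.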

We note in passing that generalized inductive definitions are regarded as 
an acceptable method of definition in most forms of constructivism (not in 
finitism), e.g. in intuitionism, CRA (MARKOV [1974]) and strict
constructivism.

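Classically $K_0 \subset K_2$. To see this, note first that for any $\beta \in K_0 - K_2$,
$\beta 0 = 0$ (since otherwise $\beta = \lambda n.\beta 0 \in K_2$ by K1) and also
$\exists x\,(\lambda n.\beta(\hat{x} * n) \in K_0 - K_2)$. Thus, if $\alpha \in K_0 - K_2$, there exists by an axiom of dependent
choices a $\gamma$ such that $\forall x\,(\lambda n.\alpha(\bar\gamma x * n) \in K_0 - K_2)$, and thus
$\forall x\,(\alpha(\bar\gamma x) = 0)$; but this conflicts with $\alpha \in K_0$, since this implies $\forall\gamma\,\exists x\,(\alpha(\bar\gamma x) \neq 0)$.

The converse $K_2 \subset K_0$ also holds constructively: apply K3 with
$Q\alpha \equiv \forall\beta\,\exists x\,(\alpha(\bar\beta x) \neq 0)$ and with $Q\alpha \equiv \forall n\,m\,(\alpha n \neq 0 \to \alpha n = \alpha(n * m))$.

Brouwer's "bar theorem" amounted to accepting $K_0 = K_2$ also
intuitionistically. (For an extensive discussion see TROELSTRA [1977a].) We
can prove more than $K_2 \subset K_0$, namely: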


6.7. LEMMA. $K_2 \subset K_1$.


PROOF. Apply K3 with $K_1\alpha$ for $Q\alpha$.  □


The next lemma and theorem give us an axiomatization equivalent to 
K1-3. 


LEMMA (Induction over unsecured sequences). In EL* + K1-3 we can
derive

    IUS   $K_2\alpha \to [\forall n\,(\alpha n \neq 0 \to Q'n) \wedge \forall n\,(\forall y\, Q'(n * \hat{y}) \to Q'n) \to Q'\langle\,\rangle]$.


PROOF. Apply K3 to

    $Q\alpha \equiv \forall m\,[\forall n\,(\alpha n \neq 0 \to Q'(m * n)) \wedge \forall n\,(\forall y\, Q'(m * n * \hat{y}) \to Q'(m * n)) \to Q'm]$;

this yields the assertion of the lemma if we take $m = \langle\,\rangle$.  □


6.8. THEOREM. IUS + $\forall\alpha\,(K_2\alpha \to \forall n\,m\,(\alpha n \neq 0 \to \alpha n = \alpha(n * m)))$ + K1-2
is deductively equivalent to K3 relative to EL*.


PROOF. K1, K2 are easy. To establish K3, we show that under the hypotheses
$\forall\alpha\,(A(\alpha, Q) \to Q\alpha)$ and $K_2\alpha$, $Q\alpha$ holds; this can be done by application of
IUS to $Q'n \equiv \lambda m.\alpha(n * m) \in Q$.  □


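Kleene formulated (e.g. in KLEENE and VESLEY [1965]) Brouwer's "bar
theorem" (induction over the partial ordering of a well-founded tree) as:

    BI$_D$   $\forall\alpha\,\exists x\, P(\bar\alpha x) \wedge \forall n\,(Pn \vee \neg Pn) \wedge \forall n\,(Pn \to Qn) \wedge \forall n\,(\forall y\, Q(n * \hat{y}) \to Qn) \to Q\langle\,\rangle$.

A slightly stronger form is:

    BI   $\forall\alpha\,\exists x\, P(\bar\alpha x) \wedge \forall n\,m\,(Pn \to P(n * m)) \wedge \forall n\,(Pn \to Qn) \wedge \forall n\,(\forall y\, Q(n * \hat{y}) \to Qn) \to Q\langle\,\rangle$.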


6.9. THEOREM. EL* + K1-3 + $K_0 = K_2$ $\vdash$ BI$_D$.


PROOF. Let

    $\forall\alpha\,\exists x\, P(\bar\alpha x)$,   (3)
    $\forall n\,(Pn \to Qn)$,   (4)
    $\forall n\,(Pn \vee \neg Pn)$,   (5)
    $\forall n\,(\forall y\, Q(n * \hat{y}) \to Qn)$.   (6)

Since $P$ is decidable, the function $\beta$ defined by

    $\beta n = 1 \leftrightarrow \exists m \leq n\, Pm$,
    $\beta n = 0$ otherwise

is an element of $K_0$ by virtue of (3). Now put

    $Q'n \equiv_{\mathrm{def}} Qn \vee \exists m < n\, Pm$

and assume $\forall y\, Q'(n * \hat{y})$. If $\exists m < n\, Pm$, or $Pn$, then $Q'n$ (by the definition
of $Q'n$ and (4)). Assuming $\neg\exists m \leq n\, Pm$, i.e. $\neg\exists m < n * \hat{y}\, Pm$ for all $y$,
then by $\forall y\, Q'(n * \hat{y})$, $\forall y\, Q(n * \hat{y})$, hence with (6) $Qn$ and thus $Q'n$.
We have now shown

    $\exists m \leq n\, Pm \vee \neg\exists m \leq n\, Pm \to (\forall y\, Q'(n * \hat{y}) \to Q'n)$

and therefore by (5) also

    $\forall y\, Q'(n * \hat{y}) \to Q'n$.

It is also obvious that

    $\beta n \neq 0 \to Q'n$

and so with the lemma on IUS, and $K_0 = K_2$, we find $Q'0$, hence $Q0$ since
$\exists m < 0\, Pm$ is excluded.  □


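6.10. THEOREM. Relative to EL* + CONT$_0$, BI$_D$ is deductively equivalent
to BI.


PROOF. Assume BI$_D$, (3), (4), (6) and

    $\forall n\,m\,(Pn \to P(n * m))$.   (7)

By CONT$_0$ there is a $\beta \in K_0$ such that

    $\forall\alpha\, P(\bar\alpha(\beta(\alpha)))$.

We put

    $P'n \equiv_{\mathrm{def}} \beta n \neq 0 \wedge \mathrm{lth}(n) \geq \beta n \dot{-} 1$.

Then obviously $\forall\alpha\,\exists x\, P'(\bar\alpha x)$, and since (by (7)) $P'n \to Pn$, also
$\forall n\,(P'n \to Qn)$; $\forall n\,(P'n \vee \neg P'n)$ is also obvious, and therefore with BI$_D$,
$Q0$ follows.  □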


6.11. THEOREM. Relative to EL* + CONT$_0$, BI is deductively equivalent to
transfinite induction

    TI   $\forall\alpha\,\exists x\,\neg(\alpha(x + 1) \prec \alpha x) \to [\forall x\,(\forall y \prec x\, Qy \to Qx) \to \forall x\, Qx]$

($\prec$ a relation definable in EL*).
Relative to EL*, the schemata TI$_D$ (TI restricted to decidable $Q$) and BI$_D$
are equivalent.


PROOF. See HOWARD and KREISEL [1966], Corollary 1 to Theorem 5B, and
Theorem 6B.  □


The following theorem is similar to $K_0 = K_2 \to$ BI$_D$ derived above.

6.12. THEOREM. EL* + $K_0 = K_1$ $\vdash$ FAN$_D$, where FAN$_D$ is the schema

    FAN$_D$   $\forall\alpha\,\exists x\, A(\bar\alpha x) \wedge \forall n\,(An \vee \neg An) \to \exists z\,\forall\beta \leq \lambda x.1\;\exists x \leq z\; A(\bar\beta x)$.


We leave the proof to the reader. 
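
The conclusion of FAN$_D$ asserts a uniform bound $z$ on the binary fan. The Python sketch below only illustrates what such a bound is; it is not a proof, and the brute-force search with a cut-off `max_depth` is an assumption made to keep the example terminating. The names `barred` and `fan_bound`, and the sample predicate, are hypothetical.

```python
from itertools import product


def barred(A, branch):
    """True if some initial segment of the finite branch satisfies A."""
    return any(A(branch[:x]) for x in range(len(branch) + 1))


def fan_bound(A, max_depth=20):
    """Least z <= max_depth such that every binary sequence of length z has an
    initial segment (of length <= z) satisfying the decidable predicate A;
    None if no such z is found below the cut-off."""
    for z in range(max_depth + 1):
        if all(barred(A, b) for b in product((0, 1), repeat=z)):
            return z
    return None


def sample_bar(n):
    """Sample decidable predicate: branches starting with 1 are barred at
    length 1, all remaining branches at length 3."""
    return len(n) >= 1 and (n[0] == 1 or len(n) >= 3)
```

With `sample_bar`, individual branches are barred at different depths, but the single value `z = 3` works for all of them at once.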


6.13-6.16. Elimination of choice sequences 
6.13. Description of the systems CS,, CSx, CS 


The developments in this and the next two subsections are necessarily 
sketchy at some places, but sufficiently detailed, we trust, to give a good 
impression of the methods used. For full details see KREISEL and
TROELSTRA [1970], and TROELSTRA [1974].

In the discussion below we shall consider theories H in a language L[H];
L[H] contains (1) numerical variables; (2) constructive function variables
($a, b, c, d$); (3) variables for the elements of a certain class $K$ of
neighbourhood functions ($K$-variables; $e, f, e', \ldots$).

Note. If $K$ is a constant definable in H (without the use of $K$-variables),
then the addition of $K$-variables produces systems which are definitional
extensions of systems without $K$-variables.

Furthermore L[H] contains 

(i) all constants of EL, 

(ii) certain constants for elements of K, 

(iii) certain constants for operators of types N>K, K*—K, K’>K 
(notably k,;,:). The presence of such constants together with their 
defining axioms implicitly expresses certain closure properties of K. 

(iv) $K$-abstraction, indicated by $\lambda'$: if $\varphi$ is a $K$-term (= $K$-functor) and $t$
a numerical term, then $\lambda' n.\varphi(\langle t\rangle * n)$ is again a $K$-functor.

(v) operations $\cdot\,|\,\cdot$ and $\cdot(\cdot)$ with the following rule of term-formation: if $\varphi$ is
a $K$-functor, $\varphi'$ a functor, then $\varphi\,|\,\varphi'$ is a functor (= function term), $\varphi(\varphi')$
a numerical term.

The systems H considered are axiomatized on the basis of many-sorted
intuitionistic predicate logic, all the axioms and axiom schemata of EL and
in addition AC$_{01}$ = AC-NF,

    AC-NF   $\forall x\,\exists a\, A(x, a) \to \exists b\,\forall x\, A(x, (b)_x)$.

There are axioms for the elements of $K$ (varying in the different
applications) implying

    $\forall e\,\forall b\,\exists x\,(e(\bar{b}x) \neq 0)$,
    $\forall n\,m\,(en \neq 0 \to en = e(n * m))$,   (1)

i.e. the elements of $K$ are neighbourhood functions on the range of the
function variables. (1) justifies the introduction of axioms

    $e(a) = x \leftrightarrow \exists y\,(e(\bar{a}y) = x + 1)$,   (2)
    $(e\,|\,a)(x) = y \leftrightarrow \exists z\,(e(\hat{x} * \bar{a}z) = y + 1)$.   (3)

(2) and (3) show that the introduction of the operations $\cdot(\cdot)$ and $\cdot\,|\,\cdot$
amounts to a definitional extension; the reason for introducing $K$-variables
is that this permits us to regard $\cdot(\cdot)$ and $\cdot\,|\,\cdot$ as total operations provided the
range of the first argument is restricted to $K$-functions. For $\lambda'$ we have

    $(\lambda' n.e(\hat{x} * n))t = e(\hat{x} * t)$.   (4)

With every element $e$ of $K$ we can associate two continuous functionals
$\Phi_e : \mathbb{N}^{\mathbb{N}} \to \mathbb{N}$ and $\Psi_e : \mathbb{N}^{\mathbb{N}} \to \mathbb{N}^{\mathbb{N}}$ such that

    $\Phi_e(a) = e(a)$,   $\Psi_e(a) = e\,|\,a$.
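
Axiom (3) describes how the sequence $e\,|\,a$ is read off from $e$. The following Python sketch is an informal illustration under stated assumptions: finite sequences are tuples, sequences are functions on the natural numbers, and the names `bar_apply` and `e_shift` are hypothetical. The inner search terminates only if, for each $x$, the function $n \mapsto e(\hat{x} * n)$ is a neighbourhood function.

```python
from itertools import count


def bar_apply(e, a):
    """Return the sequence e|a of axiom (3): (e|a)(x) = y iff
    e(<x> * initial segment of a) = y + 1 for some initial segment."""
    def result(x):
        segment = ()
        for z in count():
            value = e((x,) + segment)   # e on <x> followed by the segment of length z
            if value != 0:
                return value - 1
            segment += (a(z),)          # read one more value of a
    return result


def e_shift(m):
    """Hypothetical element coding the shift functional: (e|a)(x) = a(x + 1)."""
    if len(m) == 0:
        return 0
    x, segment = m[0], m[1:]
    return segment[x + 1] + 1 if len(segment) > x + 1 else 0
```

With `a = lambda k: k * k`, the sequence `bar_apply(e_shift, a)` takes the values 1, 4, 9, ... , each computed from a finite initial segment of `a`.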


For the proof of the elimination theorems and their applications K has to 
satisfy certain closure conditions which must be derivable from the axioms 
for K in H. 

A complete list of these closure conditions CC is irrelevant for our 
purposes. For a list CC which is in any case sufficient for the applications 
discussed below, see TROELSTRA [1974]. Among other things, CC expresses 
closure under certain primitive recursive operations :,:, A such that 


(e;fy(aj=e(fla), (e:flla=el(fla), (ela)(fla)=(erf)(a), 
i.e. 
®..,a = D, (Ya), W..,a = VU. (Ya), ®,,,a =(V.a)(Ya). 


We also need an operator mapping sequences onto sequences with initial
segment $n$. Let $k(n, m)$ be a function such that

    $m = \hat{x} * m' \wedge x < \mathrm{lth}(n) \to k(n, m) = (n)_x$,
    $m = \hat{x} * m' \wedge x \geq \mathrm{lth}(n) \wedge \mathrm{lth}(m') > x \to k(n, m) = (m')_x$,
    $k(n, m) = 0$ in all other cases.

We shall assume CC to be such that $\lambda m.k(n, m) \in K$; assuming that in
L[H] $k^t$ is a $K$-term for each $t$ such that in H $k^n m = k(n, m)$, we may
introduce the abbreviation

    $n\,|\,\varphi \equiv_{\mathrm{def}} k^n\,|\,\varphi$.

Obviously,

    $n\,|\,a \in n$,   $a \in n \to n\,|\,a = a$,
    $(n\,|\,a)(x) = ax$ for $x \geq \mathrm{lth}(n)$.
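
The three properties just listed determine the behaviour of $n\,|\,a$ completely: the first $\mathrm{lth}(n)$ values are taken from $n$, the remaining ones from $a$. A minimal Python sketch, with finite sequences as tuples and the hypothetical name `overwrite`, reads as follows; it implements the properties directly rather than going through the function $k$.

```python
def overwrite(n, a):
    """The sequence n|a: agrees with the finite sequence n below lth(n)
    and with the sequence a from lth(n) onwards."""
    def result(x):
        return n[x] if x < len(n) else a(x)
    return result
```

If `a` already starts with `n`, then `overwrite(n, a)` and `a` agree everywhere, which is the second property above.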


Now a system H may be embedded in a corresponding theory CS$_H$ as
follows. The language of CS$_H$ contains choice variables $\alpha, \beta, \gamma, \delta$, and the
rules of term-formation of H are extended in an obvious way:

(1) If $t$ is a numerical term, $\varphi$ a choice functor, then $\varphi t$ is a numerical
term.

(2) If $\varphi$ is a $K$-functor, $\varphi'$ a choice functor, then $\varphi\,|\,\varphi'$ is a choice
functor, and $\varphi(\varphi')$ a numerical term.

(3) If $t$ is a term, then $\lambda'' x.t$ is a choice functor.

In CS$_H$, $\lambda$ is restricted in an obvious manner: $\lambda$ is only to be applied to
terms not containing choice parameters. The reason for distinct abstraction
operators $\lambda, \lambda', \lambda''$ is that we wish to keep functors, $K$-functors and
choice-functors syntactically distinct. (For a complete description of the rules of
term-formation, see KREISEL and TROELSTRA [1970].)

Convention. In writing $A(\alpha_1, \ldots, \alpha_n)$ etc. for a formula with parameters
$\alpha_1, \ldots, \alpha_n$, we shall assume all the choice parameters to be contained
among $\alpha_1, \ldots, \alpha_n$.

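To the axioms of H we add

    A1   $\forall\alpha\,\exists x\,(e(\bar\alpha x) \neq 0)$,
    A2   $\forall\alpha\,\exists\beta\, A(\alpha, \beta) \to \exists e\,\forall\alpha\, A(\alpha, e\,|\,\alpha)$,
    A3   $\forall\alpha\,(A\alpha \to B\alpha) \to \forall e\,[\forall\alpha\, A(e\,|\,\alpha) \to \forall\alpha\, B(e\,|\,\alpha)]$.

To obtain the system CS$_H$ we add also

    A4   $\forall\alpha\,\exists a\, A(\alpha, a) \to \exists e\,\exists b\,\forall n\,(en \neq 0 \to \forall\alpha\, A(n\,|\,\alpha, \lambda m.b(\langle en \dot{-} 1\rangle * m)))$.

Using closure conditions on $K$ we can show this to imply

    $\forall\alpha\,\exists e\, A(\alpha, e) \to \exists e\,\exists f\,\forall n\,(en \neq 0 \to \forall\alpha\, A(n\,|\,\alpha, \lambda' m.f(\langle en \dot{-} 1\rangle * m)))$.

By predicate logic, the implication in A3, A4 also holds in the other
direction. We remark

(i) A2 implies various forms of $\forall\alpha\,\exists x$-continuity such as

    $\forall\alpha\,\exists x\, A(\alpha, x) \leftrightarrow \exists e\,\forall\alpha\, A(\alpha, e(\alpha))$,
    $\forall\alpha\,\exists x\, A(\alpha, x) \leftrightarrow \exists e\,\forall n\,(en \neq 0 \to \forall\alpha\, A(n\,|\,\alpha, en \dot{-} 1))$.

Note that $\forall\alpha\, A(n\,|\,\alpha, en \dot{-} 1) \leftrightarrow \forall\alpha \in n\, A(\alpha, en \dot{-} 1)$. We leave it to the
reader to derive these forms from A2.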


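(ii) Another corollary of A2 is the "specialization principle",

    SP   $\exists\alpha\, A\alpha \to \exists a\, A(\lambda'' x.ax)$

($A\alpha$ not containing choice parameters besides $\alpha$). To see this, note that
$\exists\alpha\, A\alpha$ implies $\forall\beta\,\exists\alpha\, A\alpha$, hence by A2, $\forall\beta\, A(e\,|\,\beta)$ for some $e$, and
therefore $A(e\,|\,\lambda'' x.0)$.

(iii) It is also not hard to show that

    $\forall a\, Aa \leftrightarrow \forall\alpha\, A\alpha$

for $A$ prime, since the truth of $A\alpha$ for $A$ prime depends on finite initial
segments of $\alpha$: since $\forall\alpha\,(A\alpha \vee \neg A\alpha)$, i.e. $\forall\alpha\,\exists x\,[(x = 0 \wedge A\alpha) \vee (x = 1 \wedge \neg A\alpha)]$,
it follows that $\exists e\,\forall n\,(en \neq 0 \to (\forall\alpha \in n\, A\alpha) \vee (\forall\alpha \in n\,\neg A\alpha))$.
Therefore, if $\forall a\, Aa$, $\forall\alpha \in n\,\neg A\alpha$ is excluded for all $n$,
and thus $\exists e\,\forall n\,(en \neq 0 \to \forall\alpha \in n\, A\alpha)$, which in view of $\forall\alpha\,\exists x\,(e(\bar\alpha x) \neq 0)$
implies $\forall\alpha\, A\alpha$.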


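6.14. Description of the elimination mapping $\tau$


The description is by induction on the number of logical operators within
the scope of a choice quantifier (including the choice quantifiers
themselves); we assume $\vee$ to have been eliminated beforehand by the use of
$(A \vee B) \leftrightarrow \exists x\,[(x = 0 \to A) \wedge (x \neq 0 \to B)]$. We define an auxiliary
mapping $\mapsto$ for formulae starting with $\forall\alpha$, $\exists\alpha$ and not containing any choice
parameters free. The general idea behind the definition of $\tau$ is to push
choice quantifiers inwards as far as possible.

(i) $\exists\alpha\, A\alpha \mapsto \exists a\, Aa$,
(ii) $\forall\alpha\, A\alpha \mapsto \forall a\, Aa$ for $A$ prime,
(iii) $\forall\alpha\,(A\alpha \wedge B\alpha) \mapsto \forall\alpha\, A\alpha \wedge \forall\alpha\, B\alpha$,
(iv) $\forall\alpha\,(A\alpha \to B\alpha) \mapsto \forall e\,(\forall\alpha\, A(e\,|\,\alpha) \to \forall\alpha\, B(e\,|\,\alpha))$,
(v) $\forall\alpha\,\exists x\, A(\alpha, x) \mapsto \exists e\,\forall n\,(en \neq 0 \to \forall\alpha\, A(n\,|\,\alpha, en \dot{-} 1))$,
(vi) $\forall\alpha\,\exists\beta\, A(\alpha, \beta) \mapsto \exists e\,\forall\alpha\, A(\alpha, e\,|\,\alpha)$,
(vii) $\forall\alpha\,\forall x\, A(\alpha, x) \mapsto \forall x\,\forall\alpha\, A(\alpha, x)$,
      $\forall\alpha\,\forall a\, A(\alpha, a) \mapsto \forall a\,\forall\alpha\, A(\alpha, a)$,
      $\forall\alpha\,\forall e\, A(\alpha, e) \mapsto \forall e\,\forall\alpha\, A(\alpha, e)$,
(viii) $\forall\alpha\,\forall\beta\, A(\alpha, \beta) \mapsto \forall e\,\forall f\,\forall\alpha\, A(e\,|\,\alpha, f\,|\,\alpha)$,
(ix) $\forall\alpha\,\exists a\, A(\alpha, a) \mapsto \exists e\,\exists b\,\forall n\,(en \neq 0 \to \forall\alpha\, A(n\,|\,\alpha, \lambda m.b(\langle en \dot{-} 1\rangle * m)))$,
     $\forall\alpha\,\exists e\, A(\alpha, e) \mapsto \exists e\,\exists f\,\forall n\,(en \neq 0 \to \forall\alpha\, A(n\,|\,\alpha, \lambda' m.f(\langle en \dot{-} 1\rangle * m)))$.

Applying $\mapsto$ as often as possible we end up with a formula not containing
any choice variables or quantifiers; we denote the result of this process
applied to $A$ by $\tau(A)$.


Note. (1) Formulae of H are left unchanged by $\tau$.

(2) The clauses (i)-(viii) in the definition of $\mapsto$ suffice to define $\tau$ for all $A$
which do not contain quantifiers $\exists e$, $\exists a$ within the scope of a universal
choice quantifier, and do not contain choice parameters.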

From the definitions and remarks in 6.13-6.14 we obtain: 


6.15. FIRST ELIMINATION THEOREM. (i) If $A$ does not contain quantifiers $\exists a$,
$\exists e$ within the scope of a universal choice quantifier, then


CSahtA —7(A). 
(ii) In CS$_H$, for formulae not containing choice parameters free,

    CS$_H$ $\vdash A \leftrightarrow \tau(A)$.

PROOF. At each step $\mapsto$ replaces a (sub-)formula by an equivalent formula,
as can be seen by inspection of the clauses of the definition of $\tau$, taking into
account the various consequences of A1-A4 noted in the preceding
subsection.


But we have more: 


6.16. SECOND ELIMINATION THEOREM.

    H $\vdash \tau(A)$ $\Leftrightarrow$ CS$_H$ $\vdash A$.


PROOF. By induction on the length of derivations in CS$_H$. See KREISEL and
TROELSTRA [1970], §7.  □


These theorems permit us to treat the choice quantifiers in CS$_H$ as a
"figure of speech" which can be fully explained relative to H.


6.17-6.22. Some applications 


6.17. Let H* be the system corresponding to H, but written with variables
$\alpha, \beta, \gamma, \delta$ instead of $a, b, c, d$. Let $\Gamma$ be a set of additional postulates, $\Gamma^*$
their transcription in the language of H*. Let us assume

    H* + $\Gamma^*$ $\subset$ CS$_H$.

For sentences $A$ of CS$_H$ not containing quantifiers $\exists a$, $\exists e$ within the scope
of choice quantifiers,

    CS$_H$ $\vdash A$ $\Leftrightarrow$ H $\vdash \tau(A)$.

If $A^*$ denotes the rewriting of $A \in$ L[H] into L[H*], and if

    H + $\Gamma$ $\vdash A$,

then also

    H* + $\Gamma^*$ $\vdash A^*$,

hence

    CS$_H$ $\vdash A^*$

and therefore H $\vdash \tau(A^*)$. For arithmetical $A$, $A^* \equiv A \equiv \tau(A^*)$, and
therefore H + $\Gamma$ is conservative over H. This observation is used in the following
two applications:


6.18. THEOREM. EL + CONT$_1$ is conservative over EL + AC-NF w.r.t.
arithmetical sentences. Here CONT$_1$ is

    $\forall a\,\exists b\, A(a, b) \to \exists e \in K_0\,\forall a\, A(a, e\,|\,a)$.


PROOF. Let H be a rewriting of EL + AC-NF in the language L[H] as
described above, and let $K = K_0$ be the axioms for $K$; then apply the
preceding observation.  □


6.19. THEOREM. EL + CONT$_1$ + FAN is conservative over EL + AC-NF
w.r.t. arithmetical sentences. Here FAN is

    FAN   $\forall a \leq \lambda x.1\;\exists x\, A(a, x) \to \exists z\,\forall a \leq \lambda x.1\;\exists y\,\forall b \leq \lambda x.1\,(\bar{a}z = \bar{b}z \to A(b, y))$.


PROOF. Similar to the preceding theorem, but now the axioms for $K$ are
$K = K_1$.  □


Remark. In view of a result of GOODMAN [1968] (also proved, by different
methods, in MINC [1975]), that EL + AC-NF is conservative over HA, the
conclusion in Theorems 6.18 and 6.19 may be strengthened to
conservativeness over HA.


6.20. THEOREM. Let IDB$_1$ be the theory H with axioms $K = K_2$ for $K$. Then
CS$_{\mathrm{IDB}_1}$ = CS is conservative over IDB$_1$.


PROOF. Apply the second elimination theorem.  □

Note also that if $A$ does not contain $\vee$, $\exists$ within the scope of a universal
function quantifier, one easily verifies H $\vdash \tau(A^*) \leftrightarrow A$ ($A^*$ as before the
result of rewriting a formula $A$ of H in the language of H*). The proof is by
induction on the number of logical operators within the scope of a
universal function quantifier in $A$; the preceding applications in Theorems
6.18 and 6.19 permit a slight strengthening by this remark.


6.21. THEOREM. Let FIM be the system axiomatized as

    EL + AC-NF + BI + CONT$_1$

(FIM is essentially the system of KLEENE and VESLEY [1965]); CS is a
conservative extension of FIM.


PROOF. (IDB$_1$)* is a subsystem of FIM, if we define $K$ as $K_0$, the set of
neighbourhood functions (see 6.2(1)).
If we take any sentence $A$ in the language of FIM such that CS $\vdash A$, then

    IDB$_1$ $\vdash \tau(A)$,

while on the other hand, since FIM $\vdash (\tau(A))^*$ and FIM $\vdash (\tau(A))^* \leftrightarrow A$
(since $A$ did not contain quantifiers $\exists e$, $\exists a$, there was no need to use the
postulate A4 in proving this equivalence), it follows that FIM $\vdash A$.  □


6.22. For future use we note the following lemma, the proof of which is left 
to the reader: 


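LEMMA. Let CS $\vdash \forall\alpha\,[\forall x\, A(x, \alpha) \to \forall u\,\exists v\, B(u, v, \alpha)]$, $A$, $B$ quantifier-free,
$\alpha$ the only choice parameter in $A$, $B$. Then, with IDB$_1$ as in 6.20,

    IDB$_1$ $\vdash \forall a\,[\forall x\, A(x, a) \to \forall u\,\exists v\, B(u, v, a)]$.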


6.23. Realizability by functions 


Kleene gave a reinterpretation for the system FIM (defined in Theorem
6.21 of the previous subsection) by means of realizability which interprets
FIM in B (essentially EL* + AC-NF + BI), the basic system which is also
compatible with classical logic (see KLEENE [1969], and KLEENE and VESLEY
[1965]).

The definition is based on continuous function-application $\cdot\,|\,\cdot$ instead of
partial-recursive function-application:

    $\alpha\,|\,\beta \simeq \gamma \equiv_{\mathrm{def}} \forall x\,(\alpha(\hat{x} * \bar\beta(\min_z[\alpha(\hat{x} * \bar\beta z) \neq 0])) \dot{-} 1 = \gamma x)$.

If we define "almost-negative" as before, but now permitting also
existential function quantifiers, and we write "$\alpha$ realizes $A(\beta_1, \ldots, \beta_n)$ in this
new sense" as "$\alpha\;\mathrm{r}'\;A(\beta_1, \ldots, \beta_n)$", then we can formulate a
characterization theorem similar to 4.8.


THEOREM. (i) For almost negative $A$, EL* $\vdash A \leftrightarrow \exists\alpha\,(\alpha\;\mathrm{r}'\;A)$.
(ii) Let the schema of generalized continuity GC be defined as

    GC   $\forall\alpha\,[A\alpha \to \exists\beta\, B(\alpha, \beta)] \to \exists\gamma\,\forall\alpha\,[A\alpha \to\; !\gamma\,|\,\alpha \wedge B(\alpha, \gamma\,|\,\alpha)]$

with $A$ almost negative, $!\gamma\,|\,\alpha$ expressing $\exists\delta\,(\gamma\,|\,\alpha = \delta)$. Then

    EL* + GC $\vdash A \leftrightarrow \exists\alpha\,(\alpha\;\mathrm{r}'\;A)$   for all $A$,

and

    H + GC $\vdash A$ $\Leftrightarrow$ H $\vdash \exists\alpha\,(\alpha\;\mathrm{r}'\;A)$   for H = EL*, EL* + BI, EL* + FAN.


7. Lawless sequences 


7.1. In the preceding section we only presented a rough intuitive plausi- 
bility argument for the continuity axioms for choice sequences, and then 
proceeded to the treatment of choice sequences as a "figure of speech". At
present, no really satisfactory concept of choice sequence satisfying the 
axioms of CS is known — for a discussion of this problem, we refer to 
TROELSTRA [1977a], Appendix C. 

There is one example of a simple concept of sequence (not a priori 
determined by a law) where it is possible to give an informal, but rigorous 
justification of the axioms (including continuity axioms): lawless sequences. 
The concept is meaningful from an intuitionistic viewpoint, but is of course 
illegitimate from the point of view of CRA. 

Lawless sequences are most easily described as follows. We think of a
lawless sequence as a process of determining values, such that eventually,
for any given natural number, a corresponding term (value) of the
sequence will be determined; at any given stage, however, only finitely
many terms of the sequence have been defined. We postulate that to each
prescribed finite sequence $n$ there is a lawless sequence starting with $n$. Of
course, one may also consider sequences lawless relative to a given
(lawlike) finitely branching tree: i.e. we know a priori that the sequence will
be an infinite branch of the tree, but otherwise the sequence is lawless. A
good "model" for a lawless sequence (with values $x$, $1 \leq x \leq 6$) is provided
by the sequence of the casts of a die, provided we permit at the beginning a
finite number of deliberate placings of the die. Different lawless sequences
may be identified in the model with different dice.
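
The die model can be imitated in a few lines of Python. This is only an illustration: the class name `DieSequence` is hypothetical, and a pseudo-random generator is of course itself given by a law, so the sketch models only the way a lawless sequence is presented (a deliberately placed initial segment, followed by values that become known one at a time), not lawlessness itself.

```python
import random


class DieSequence:
    """A sequence of die casts of which, at any stage, only a finite
    initial segment has been determined."""

    def __init__(self, placed=(), rng=None):
        self._values = list(placed)          # finitely many deliberate placings
        self._rng = rng or random.Random()   # stand-in for the physical die

    def __call__(self, x):
        # Cast the die as often as needed to determine the value at x.
        while len(self._values) <= x:
            self._values.append(self._rng.randint(1, 6))
        return self._values[x]

    def known(self):
        """The initial segment determined so far."""
        return tuple(self._values)
```

Asking for a value extends the known initial segment; values already determined never change, and at every stage `known()` is a finite sequence.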


7.2. Some principles which can be shown to be valid for lawless sequences
are (using $\varepsilon$, $\eta$ as variables for lawless sequences)

    $A\varepsilon \to \exists n\,[\varepsilon \in n \wedge \forall\eta \in n\, A\eta]$,   (1)
    $\forall\varepsilon\,\exists x\, A(\varepsilon, x) \to \exists e \in K_{LS}\,\forall\varepsilon\, A(\varepsilon, e(\varepsilon))$,   (2)

where

    $e \in K_{LS} \equiv_{\mathrm{def}} \forall\varepsilon\,\exists x\,(e(\bar\varepsilon x) \neq 0) \wedge \forall n\,m\,(en \neq 0 \to en = e(n * m))$,

and the "extension principle"

    $e \in K_{LS} \to \forall a\,\exists x\,(e(\bar{a}x) \neq 0)$.   (3)

As stated in 7.1, we also postulate

    $\forall n\,\exists\varepsilon\,(\varepsilon \in n)$.   (4)


Let us consider for instance (1). Suppose we have a proof of $A\varepsilon$; since
this proof must be completed at a certain stage, it must depend exclusively
on our knowledge of the initial segment of $\varepsilon$ known at that stage, say $n$;
hence the property $A$ should equally hold for all lawless $\eta$ with initial
segment $n$.

(2) can be justified in the spirit of 6.2, but now even more convincingly,
since for all lawless sequences, at any stage, only an initial segment is
known.

(3) expresses that neighbourhood functions for type $\mathbb{N}^{\mathbb{N}} \to \mathbb{N}$ functionals
on lawless sequences automatically extend to functionals defined on
lawlike sequences as well. For a discussion of the extension principle see
TROELSTRA [1977a], 2.11.


Remark. (i) Note that the justification of (1) is given, quite naturally, in
subjectivistic terms ("what does the idealized mathematician know at a
certain stage"). The justification of the axioms for lawless sequences is
discussed at much greater length in Ch. 2 of TROELSTRA [1977a]. The
justification of (1) also provides another example of a derivation of axioms
by reflection on abstract concepts.

(ii) The interest of the lawless sequences is not only pedagogical;
although we have introduced them here mainly to demonstrate the
possibility of deriving rigorously axioms for a concept of sequence which is
an "incomplete object", and to show the use of the subjectivistic
interpretation, they have other uses besides: (1) the construction of universes of
sequences which are suitable for (real-valued) analysis (cf. Ch. 4 of
TROELSTRA [1977a]); and (2) in the discussion of validity and completeness
of intuitionistic predicate logic (see Section 9, and TROELSTRA [1977b];
[1977a], Ch. 7).


8. Markov’s principle 


8.1. Considerable interest attaches to "Markov's schema" which can be
stated as

    M   $\forall x\,(Ax \vee \neg Ax) \wedge \neg\neg\exists x\, Ax \to \exists x\, Ax$

($A$ containing numerical parameters only) and its variants and weakenings.
We may paraphrase M as follows: suppose A to be a predicate of natural 
numbers which can be tested for each number; and we know by indirect 
arguments that there should be an x such that Ax; then we also believe 
that a computer with unbounded memory (or an algorithm) asked to search 
for an x such that Ax will eventually find one (given enough time). This 
paraphrase does not fit the variant where we permit e.g. choice sequences 
as parameters; and in fact, since M was first considered in the context of 
CRA (cf. Markov [1962]), its origin is purely mechanistic. 
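
The mechanistic paraphrase of M is simply unbounded search. A minimal Python sketch, with the hypothetical name `markov_search`, makes this concrete; the predicate `A` is assumed to be decidable (a total Boolean function), and the loop terminates only if some `x` with `A(x)` exists, which is exactly the point at issue.

```python
from itertools import count


def markov_search(A):
    """Test A(0), A(1), ... in turn and return the first x for which A holds.
    Termination is not guaranteed by the code itself; it is what M asserts
    whenever it is contradictory that no such x exists."""
    for x in count():
        if A(x):
            return x
```

For example, `markov_search(lambda x: x * x > 50)` tests the numbers 0 to 8 and returns 8.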

Nevertheless, there are interesting problems connected with the validity
of (the analogue of) M where non-numerical parameters are present, so we
shall also discuss (in Section 8.3)

    M($\mathcal{U}$)   $\forall\alpha\,[\neg\neg\exists x\,(\alpha x = 0) \to \exists x\,(\alpha x = 0)]$

where $\alpha$ is supposed to range over a universe $\mathcal{U}$ of sequences.
Note that M itself, in the presence of CT$_0$, is equivalent to the special
form

    M$_{PR}$   $\forall x\,y\,[\neg\neg\exists z\, Txyz \to \exists z\, Txyz]$

or equivalently

    $\neg\neg\exists x\, Ax \to \exists x\, Ax$   ($A$ primitive recursive).


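8.2. Mathematical consequences of M and M($\mathcal{U}$)


An easy consequence, which considerably simplifies the theory of reals
and real-valued functions, is given by the following:


THEOREM. For reals ranging over $\mathbb{R}(\mathcal{U})$,

    $\forall x\,(x \neq 0 \to x \mathrel{\#} 0) \leftrightarrow$ M($\mathcal{U}$).


Remark. In the presence of AC-NN! (= AC$_{00}$!),

    $\forall x\,\exists!y\, A(x, y) \to \exists a\,\forall x\, A(x, ax)$,

M($\mathcal{U}$) is equivalent to the variant of M where parameters ranging over $\mathcal{U}$
are permitted.


PROOF. Let $x \in \mathbb{R}(\mathcal{U})$, $\langle r_n\rangle_n \in x$, $x \neq 0$, $\forall k\,(|x - r_{2k}| < 2^{-k})$. Since $x \neq 0$,
$\neg\forall k\,(|r_{2k}| < 2^{-k})$, hence $\neg\forall k\,\neg(|r_{2k}| \geq 2^{-k})$. $|r_{2k}| \geq 2^{-k}$ is expressible
by a quantifier-free statement in the language of EL, so by QF-AC there
is a $b$ such that $bk = 0 \leftrightarrow (|r_{2k}| \geq 2^{-k})$; with M, $\exists k\,(|r_{2k}| \geq 2^{-k})$. $|x - r_{2k}| <
2^{-k} - 2^{-n}$ for a suitable $n$, hence $|x| \geq \big|\,|r_{2k}| - |x - r_{2k}|\,\big| > 2^{-n}$, so $x \mathrel{\#} 0$.
Conversely, if $\forall x\,(x \neq 0 \to x \mathrel{\#} 0)$, let $\langle s_n\rangle_n \in x$, $\langle s_n\rangle_n$ defined by $s_n = 0$ if
$\neg\exists y \leq n\,(ay = 0)$, $s_n = 2^{-k}$ if $k = \min_y[ay = 0]$ and $k \leq n$; then $\exists y\,(ay = 0) \leftrightarrow x \mathrel{\#} 0$,
$\neg\exists y\,(ay = 0) \leftrightarrow x = 0$, hence by $\forall x\,(x \neq 0 \to x \mathrel{\#} 0)$, M follows.  □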
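
The sequence $\langle s_n\rangle_n$ used in the converse direction is easy to compute. The Python sketch below is an illustration only; the function name `s` mirrors the notation of the proof, sequences are modelled as functions on the natural numbers, and exact rationals are used to avoid rounding.

```python
from fractions import Fraction


def s(a, n):
    """s_n = 2^(-k) if k <= n is the least y with a(y) = 0,
    and s_n = 0 if a has no zero among the arguments 0, ..., n."""
    for y in range(n + 1):
        if a(y) == 0:
            return Fraction(1, 2 ** y)
    return Fraction(0)
```

The sequence is constant from the first zero of `a` onwards and identically 0 if there is none, so it generates a real that is apart from 0 exactly when `a` has a zero.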


Far more interesting is the continuity theorem of CRA: with the help of
M one can show the continuity of mappings from complete separable
metric spaces into separable metric spaces (see e.g. MOSCHOVAKIS [1964],
which is written from the viewpoint of classical recursive analysis, but which
can be adapted easily to CRA). We have not formulated the most general
theorem of this kind here.

In BEESON [1975] (exposition also in TROELSTRA [1973a], §3.9) it is shown
that HA + ECT$_0$ is not sufficient to derive this continuity theorem. Inspection
of the proof of the continuity theorem in CRA shows that continuity
holds in this case for reasons quite different from the reasons leading to
continuity in the case of choice sequences.


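8.3. Markov's principle with choice parameters


Note first of all that M(LS) (i.e. M($\mathcal{U}$) with $\mathcal{U}$ the universe of lawless
sequences) is obviously false, for then

    $\forall\alpha\,\neg\neg\exists x\,(\alpha x = 0) \wedge \neg\forall\alpha\,\exists x\,(\alpha x = 0)$.

This is seen as follows: $\neg\exists x\,(\alpha x = 0)$ implies the existence of an $n$ such
that $\alpha \in n$, $\forall\beta \in n\,\neg\exists x\,(\beta x = 0)$, which is obviously false (take a
$\beta \in n * \langle 0\rangle$); and $\forall\alpha\,\exists x\,(\alpha x = 0)$ would imply the existence of an $e$ such that
$\forall\alpha\,(\alpha(e(\alpha)) = 0)$, $\forall\alpha\,\exists x\,(e(\bar\alpha x) \neq 0)$, $\forall n\,m\,(en \neq 0 \to en = e(n * m))$,
$\forall a\,\exists x\,(e(\bar{a}x) \neq 0)$; but then also $e(\lambda x.1)$ is defined, and computable from
$\overline{(\lambda x.1)}y = n$; therefore if we choose $\mathrm{lth}(n) > en \dot{-} 1$, it follows that
$(\lambda x.1)(en \dot{-} 1) = 0$, which is absurd.

For many other notions of choice sequence Markov's principle is suspect
as well. Let us consider the following very special form

    M$^*_{PR}$   $\forall\alpha \leq \lambda x.1\;\neg\neg\exists x\, A(\bar\alpha x) \to \neg\neg\forall\alpha \leq \lambda x.1\;\exists x\, A(\bar\alpha x)$   ($A$ primitive recursive).

Note that M$^*_{PR}$ is a consequence of M($\mathcal{U}$) if $\alpha$ ranges over $\mathcal{U}$ and $\mathcal{U}$ is
closed under lawlike continuous operations. We have: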


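PROPOSITION. Let H be a theory as described in 6.13, and let CS* be similar
to CS, but with $\forall\alpha\,\exists x\,(e(\bar\alpha x) \neq 0)$ left out; let $\mathcal{U}$ be the range of the choice
variables. Then

(i) M($\mathcal{U}$) + CS* $\vdash \forall\alpha\,\exists x\,(e(\bar\alpha x) \neq 0)$,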
(ii) in CSa+ Mat CT, where as before 


CT Va Ax Wy dz (Txyz a ay = Uz). 


ProoF. (i) Any e satisfies (provably in H) 
Vb Ax (e(bx) #0),  Wnm (en#0—> en =e(n*m)). (1) 
If Ja Aa is closed, then Ja Aa > Ja Aa: apply A2 of 6.13 to VB Ja Aa, 


then JeWVaA(ela), hence A(e |A"x.0). Applying this to da 3a 
(a = a) yields a contradiction, hence Va — — Ja (a@ = a). Then (1) implies 


VB Ax (e(Bx) 4 0). 


Now apply M(%) with Ax.(1— e(6x)) for a, then VB Ax (e (Bx) #0). 

(ii) Kleene constructed a binary tree with primitive recursive
characteristic function which is well-founded w.r.t. recursive sequences but not
uniformly bounded (see e.g. KLEENE and VESLEY [1965], Lemma 9.8). So,
on assumption of CT, there is a primitive recursive $\varphi$ such that

$$\forall a\,\exists x\,(\varphi(\bar a x) \neq 0),$$
$$\forall nm\,(\varphi n \neq 0 \to \varphi n = \varphi(n * m)),$$

but $\varphi \notin K_0$. On the other hand, as in the proof of (i) it follows that
$\forall\alpha\,\neg\neg\exists x\,(\varphi(\bar\alpha x) \neq 0)$, and therefore with $\mathrm{M_{PR}}$,

$$\neg\neg\forall\alpha \le \lambda x.1\;\exists x\,(\varphi(\bar\alpha x) \neq 0),$$

but this is obviously false, since $K \subset K_0$, but $\varphi \notin K_0$ ($K_0$ as defined in
6.6(1)), hence $\neg\forall\alpha\,\exists x\,(\varphi(\bar\alpha x) \neq 0)$. □


Remark. (a) Crucial in the proof of (i) and (ii) is the property
$\forall\alpha\,\neg\neg\exists a\,(\alpha = a)$; this is false for lawless sequences.
(b) The result of 9.5 shows that completeness of IPC implies $\mathrm{M_{PR}}$, hence
$\neg$CT; here the relations and domains in the definition of validity may
contain lawless and choice parameters, as specified in 9.5. Compare
also 9.3.


9. Truth-value semantics for intuitionistic logic; validity in all structures 


9.1. The principal interpretations of intuitionistic logic with the character
of a truth-value semantics considered in the literature are:

(a) Valuations in complete pseudo-Boolean algebras (or pseudo-Boolean
valued models, PBM’s for short). Detailed treatment in RASIOWA
and SIKORSKI [1963].

(b) Topological models (valuations in the algebra of open subsets of a
topological space). See e.g. RASIOWA and SIKORSKI [1963]. Applications in
SCOTT [1968, 1970], J. R. MOSCHOVAKIS [1973]. Topological models are
special cases of PBM’s.

(c) Beth models (principal sources BETH [1959], KRIPKE [1965]). Beth
models are special cases of topological models.

(d) Kripke models (KRIPKE [1965], FITTING [1969]; applications in
SMORYNSKI [1973a]). Kripke models are defined relative to partially
ordered sets; if the partially ordered sets are restricted to countable trees,
then a Kripke model can be transformed by a standard procedure into a
Beth model satisfying the same sentences (see KRIPKE [1965]). Kripke
models are always special cases of topological models.

Classically established completeness theorems for the various types of
semantics mentioned above can be used to obtain interesting results on
intuitionistic formal systems; good examples are to be found in
SMORYNSKI [1973a]. Via a detour (formalization of the proof of the
completeness theorem in a classical system, and conservativeness of classical
systems over the corresponding intuitionistic systems w.r.t. $\Pi^0_2$-formulae)
many of these results can also be established intuitionistically. The
methods are similar in character to the techniques of classical model
theory, treated elsewhere in this volume. Therefore we shall refrain from
discussing these semantics here, but turn to the notion of (intuitionistic or
constructive) validity in all structures instead.


9.2. Validity and completeness 


We define ‘‘validity in all structures” in complete analogy with the 
classical case: 


DEFINITION. Let $A(P_1,\dots,P_n)$ be a formula of IPC, with all its predicate
letters contained among $P_1,\dots,P_n$; then we put

$$\mathrm{Val}(A) \equiv_{\mathrm{def}} \forall D\,\forall P_1^*\cdots\forall P_n^*\,A^D(P_1^*,\dots,P_n^*)$$

where $D$ is an (intuitionistically meaningful) domain, $P_i^*$ a relation over $D$
with the same number of arguments as $P_i$, and $A^D(P_1^*,\dots,P_n^*)$ is obtained from
$A$ by relativizing quantifiers to $D$ and replacing $P_i$ by $P_i^*$ ($1 \le i \le n$).


The following results illustrate the influence of mathematical assumptions
on the extension of $\{\ulcorner A\urcorner : \mathrm{Val}(A)\}$.


9.3. THEOREM. Assuming CT, and restricting domains and relations to
completely defined ones (i.e. not containing non-lawlike parameters),
$\{\ulcorner A\urcorner : \mathrm{Val}(A)\}$ is not r.e. (KREISEL [1970], exposition in VAN DALEN [1973]).


Remark. In fact, the restriction to completely defined domains and 
relations can be weakened considerably. 

On the other hand, suppose we restrict $D$ to $N$, and the $P_i^*$ to numerical
relations containing a lawless parameter (more precisely, lawless relative to
a finitely branching tree), then we obtain completeness for prenex formulae
of IPC:


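9.4. THEOREM. For $A$ prenex,

$$\mathrm{Val}(A) \to \exists x\,\mathrm{Proof}_{\mathrm{IPC}}(x, \ulcorner A\urcorner)$$
and
$$\neg\exists x\,\mathrm{Proof}_{\mathrm{IPC}}(x, \ulcorner A\urcorner) \to \exists P_1^\varepsilon,\dots,P_n^\varepsilon\,\neg\forall\varepsilon\,A^N(P_1^\varepsilon,\dots,P_n^\varepsilon),$$

where $P_i^\varepsilon$ is a predicate of natural numbers depending on a lawless
parameter $\varepsilon$.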


The weak form of Markov’s principle for choice sequences, $\mathrm{M_{PR}}$, plays a
role in the following result:


9.5. THEOREM (DYSON and KREISEL [1961], KREISEL [1962], TROELSTRA
[1977a], Ch. 7). Let, in Val($A$), $D = N$, $P_i^* \subset N^{k}$, $P_i^*$ containing, besides
lawless parameters, choice parameters $\alpha$ for sequences of 0, 1 which are
assumed to satisfy

$$\forall n \in B\,\exists\alpha\,(\alpha \in n),$$

$$\forall\alpha \le \lambda x.1\;\exists x\,A(\bar\alpha x) \to \exists z\,\forall\alpha \le \lambda x.1\;\exists x \le z\,A(\bar\alpha x),$$
where $A$ is quantifier-free and $n \in B \equiv_{\mathrm{def}} \forall x < \mathrm{lth}(n)\,((n)_x \le 1)$; then weak
completeness

$$\mathrm{Val}(A) \to \neg\neg\exists x\,\mathrm{Proof}_{\mathrm{IPC}}(x, \ulcorner A\urcorner)$$
for all $A$ of IPC is equivalent to $\mathrm{M_{PR}}$.


For many more details and a leisurely discussion, see TROELSTRA [1977a,
Ch. 7; 1977b].


10. Finite type structures 


10.1. The language L(N-HA$^\omega$)


The collection of finite types T is defined inductively by

(i) $0 \in T$,

(ii) $\sigma, \tau \in T \Rightarrow (\sigma)\tau \in T$.
A model for this type structure is given by specifying a set $D_\sigma$ for each
$\sigma \in T$; in our examples, $D_0 = N$, the set of natural numbers; $D_{(\sigma)\tau}$ consists
of a set of operations (depending on the model) assigning objects of $D_\tau$ to
objects of $D_\sigma$. We have preferred to use the neutral term “operation”
instead of “function” or “mapping” since the use of the latter expressions
might suggest to the reader functions which are extensional in the sense of
classical set theory, whereas we also wish to consider models where the
operations are non-extensional.

L(N-HA$^\omega$) contains variables for each type ($x^\sigma, y^\sigma, z^\sigma, u^\sigma, v^\sigma, w^\sigma$; type
superscripts are often omitted); equality $=_\sigma$ for all types $\sigma$, and constants

$$0 \in 0, \qquad S \in (0)0, \qquad \Pi_{\sigma,\tau} \in (\sigma)(\tau)\sigma,$$
$$\Sigma_{\rho,\sigma,\tau} \in ((\rho)(\sigma)\tau)((\rho)\sigma)(\rho)\tau,$$
$$R_\sigma \in (\sigma)((\sigma)(0)\sigma)(0)\sigma \qquad \text{for all } \rho, \sigma, \tau \in T;$$

we use here as in the sequel $t \in \sigma$ to indicate that $t$ is a term of type $\sigma$.
Application from type $(\sigma)\tau$ to type $\sigma$ is also present and simply denoted by
juxtaposition; $t_1 \cdots t_n$ abbreviates $(\cdots((t_1 t_2)t_3)\cdots)t_n$.
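
The inductive definition of T and the types assigned to the constants can be mirrored in a few lines of code. The following Python sketch is only an illustration: the names `Arrow`, `arrows`, `show`, `pi_type`, `sigma_type` and `rec_type` are hypothetical helpers introduced here, and the base type 0 is represented by the Python integer 0.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Arrow:
    """The type (dom)cod of operations from dom to cod (hypothetical helper)."""
    dom: "Type"
    cod: "Type"


Type = Union[int, Arrow]  # the integer 0 stands for the base type 0
O = 0


def show(t) -> str:
    """Print a type in the (sigma)tau notation of the text."""
    return "0" if t == O else f"({show(t.dom)}){show(t.cod)}"


def arrows(*ts):
    """arrows(a, b, c) builds (a)(b)c, i.e. (a)((b)c)."""
    *args, result = ts
    for a in reversed(args):
        result = Arrow(a, result)
    return result


def pi_type(s, t):
    """Type of Pi_{s,t}: (s)(t)s."""
    return arrows(s, t, s)


def sigma_type(r, s, t):
    """Type of Sigma_{r,s,t}: ((r)(s)t)((r)s)(r)t."""
    return arrows(arrows(r, s, t), arrows(r, s), r, t)


def rec_type(s):
    """Type of R_s: (s)((s)(0)s)(0)s."""
    return arrows(s, arrows(s, O, s), O, s)
```

For instance, `show(rec_type(O))` prints the type of the recursor at the base type in the notation used above.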


10.2. The theory N-HA$^\omega$


The theory is based on many-sorted intuitionistic predicate logic, the
usual axioms for successor, “defining axioms” for $\Pi$, $\Sigma$, $R$ specified below,
and for equality, besides symmetry, reflexivity and transitivity also
substitutivity

$$x^\sigma = y^\sigma \to z^{(\sigma)\tau}x = z^{(\sigma)\tau}y,$$

$$x^{(\sigma)\tau} = y^{(\sigma)\tau} \to xz^\sigma = yz^\sigma.$$

We shall frequently omit type superscripts from now on; types are assumed
to be coherent. The defining axioms for $\Pi$, $\Sigma$, $R$ are

$$\Pi xy = x, \qquad \Sigma xyz = xz(yz), \qquad Rxy0 = x, \qquad Rxy(Sz) = y(Rxyz)z.$$
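
Read from left to right, the defining axioms compute. A minimal Python sketch, assuming type-0 objects are Python integers and higher-type objects are curried Python functions (so it illustrates the equations only, not any particular model from 10.3); the derived operations `identity`, `add` and `pred` are examples chosen here, not taken from the text:

```python
def Pi(x):
    """Pi x y = x."""
    return lambda y: x


def Sigma(x):
    """Sigma x y z = x z (y z)."""
    return lambda y: lambda z: x(z)(y(z))


def S(n):
    """Successor on type 0."""
    return n + 1


def R(x):
    """R x y 0 = x,  R x y (S z) = y (R x y z) z."""
    def with_step(y):
        def at(n):
            acc = x
            for k in range(n):   # on entry acc holds R x y k
                acc = y(acc)(k)  # afterwards acc holds R x y (S k)
            return acc
        return at
    return with_step


# Illustrative derived operations (assumptions of this sketch):
identity = Sigma(Pi)(Pi)                            # Sigma Pi Pi z = Pi z (Pi z) = z
add = lambda m: R(m)(lambda acc: lambda k: S(acc))  # add m n = m + n
pred = R(0)(lambda acc: lambda k: k)                # pred 0 = 0, pred (S n) = n
```

The loop in `R` simply unfolds the second recursion equation `n` times, which is why the sketch terminates for every natural number argument.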


10.3. Models for N-HA$^\omega$


Detailed information and further references concerning the models
discussed here can be found in TROELSTRA [1973a], Ch. 2.


10.3.1. The lawlike operations LO (GÖDEL [1958], called “berechenbare
Funktion” there.)

$D_{(\sigma)\tau}$ consists of all lawlike operators assigning elements of $D_\tau$ to all
elements of $D_\sigma$. Relying on the intuitive meaning of lawlike, there are
obviously representatives of $\Pi$, $\Sigma$, $R$ in this model. $t =_\sigma t'$ means: $t$, $t'$ are
given to us as the same law ($=_\sigma$ is intensional equality).


10.3.2. The hereditarily recursive operations HRO
We put
$$V_0(x) \equiv_{\mathrm{def}} x = x,$$
$$V_{(\sigma)\tau}(x) \equiv_{\mathrm{def}} \forall y \in V_\sigma\,\exists z \in V_\tau\,(\{x\}(y) = z).$$

We put $D_\sigma = \{(x, \sigma) : x \in V_\sigma\}$ in this model. Application is partial recursive
function application
$$(x, (\sigma)\tau)(y, \sigma) = (\{x\}(y), \tau)$$
and
$$(x, \sigma) = (y, \sigma) \equiv_{\mathrm{def}} x = y.$$

We can show HRO to be a model for N-HA$^\omega$ if we can find numbers $[0]$,
$[S]$, $[\Pi_{\sigma,\tau}]$, $[\Sigma_{\rho,\sigma,\tau}]$, $[R_\sigma]$ such that the defining equations are satisfied, e.g.
$([\Pi], (\sigma)(\tau)\sigma)(x, \sigma)(y, \tau) = (x, \sigma)$, which means that we have to find $[\Pi]$ such
that $\{\{[\Pi]\}(x)\}(y) = x$ for $x \in V_\sigma$, $y \in V_\tau$; we may take $\Lambda x\Lambda y.x$ for $[\Pi_{\sigma,\tau}]$.
Similarly, $[0] = 0$, $[S] = \Lambda x.x + 1$, $[\Sigma_{\rho,\sigma,\tau}] = \Lambda xyz.\{\{x\}(z)\}(\{y\}(z))$. The
most complicated case is $[R_\sigma]$, which can be found by an application of the
recursion theorem (which is redundant in case recursion is already among
the basic schemata for recursive functions).
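
The choice of indices for $\Pi$, $\Sigma$ and $S$ can be imitated in a toy setting. In the Python sketch below, positions in a list `registry` stand in for Gödel numbers of partial recursive functions, `ap(x, y)` plays the role of $\{x\}(y)$, and `code` allocates a fresh index much as an s-m-n function would; all of these names are assumptions of the sketch, and no membership in the classes $V_\sigma$ is checked.

```python
registry = []  # registry[i] is the program with index i


def code(program):
    """Store a one-argument program and return its fresh index."""
    registry.append(program)
    return len(registry) - 1


def ap(x, y):
    """{x}(y): run the program with index x on the number y."""
    return registry[x](y)


# [Pi] = Lambda x Lambda y. x : on input x, return an index of the constant function x.
PI = code(lambda x: code(lambda y: x))

# [Sigma] = Lambda x y z. {{x}(z)}({y}(z))
SIGMA = code(lambda x: code(lambda y: code(lambda z: ap(ap(x, z), ap(y, z)))))

# [S] = Lambda x. x + 1
SUCC = code(lambda x: x + 1)
```

Equality in this toy model is equality of indices, so it is decidable but not extensional: two calls of `ap(PI, 5)` return different indices of the same constant function.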
HRO may be viewed as a sort of “recursive approximation” to LO. For
LO, intuitively, a choice axiom

$$\mathrm{AC}_{\sigma,\tau}\qquad \forall x^\sigma\,\exists y^\tau\,A(x, y) \to \exists z^{(\sigma)\tau}\,\forall x^\sigma\,A(x, zx)$$

is a consequence of the meaning of the quantifier combination $\forall x\,\exists y$ (but
indeed only if we permit the most general kind of non-extensional
operations defined on $D_\sigma$ as elements of $D_{(\sigma)\tau}$).

For HRO, consistency of $\mathrm{AC}_{\sigma,\tau}$ is provable relative to HA; in fact, $\mathrm{AC}_{\sigma,\tau}$
is derivable for HRO in HA + ECT$_0$.


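10.3.3. The hereditarily effective operations HEO
An extensional model is obtained by defining simultaneously for each
$\sigma \in T$ equivalence relations $I_\sigma$ and domains $W_\sigma$:

$$I_0(x, y) \equiv x = y, \qquad W_0(x) \equiv x = x,$$
$$I_{(\sigma)\tau}(x, y) \equiv W_{(\sigma)\tau}(x) \wedge W_{(\sigma)\tau}(y) \wedge \forall z \in W_\sigma\,I_\tau(\{x\}(z), \{y\}(z)),$$
$$W_{(\sigma)\tau}(x) \equiv \forall y \in W_\sigma\,\bigl(\exists z \in W_\tau\,(\{x\}(y) = z) \wedge \forall y' \in W_\sigma\,(I_\sigma(y, y') \to I_\tau(\{x\}(y), \{x\}(y')))\bigr).$$

$D_\sigma$ for this model now consists of pairs $(x, \sigma)$ with $x \in W_\sigma$. Application is
defined as before, and

$$(x, \sigma) =_\sigma (y, \sigma) \equiv_{\mathrm{def}} I_\sigma(x, y).$$
Interpretations of $R_\sigma$, $\Pi_{\sigma,\tau}$, $\Sigma_{\rho,\sigma,\tau}$, 0, $S$ are constructed as for HRO.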
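

10.3.4. The intensional continuous functionals ICF($\mathcal{U}$)


This is an analogue to HRO, but now based on continuous function
application instead of partial-recursive function application. We define

$$V^1_0(x) \equiv x = x, \qquad V^1_1(\alpha) \equiv \alpha = \alpha,$$

$$V^1_{(\sigma)0}(\alpha) \equiv \forall\gamma \in V^1_\sigma\,\exists x\,(\alpha(\gamma) = x) \quad (\sigma \neq 0),$$
where
$$\alpha(\gamma) = x \equiv_{\mathrm{def}} \alpha(\bar\gamma(\min\nolimits_z[\alpha(\bar\gamma z) \neq 0])) = x + 1,$$

$$V^1_{(0)\tau}(\alpha) \equiv \forall x\,\exists\gamma \in V^1_\tau\,(\alpha \mid \lambda z.x = \gamma) \quad (\tau \neq 0)$$
and for $\sigma, \tau \neq 0$,
$$V^1_{(\sigma)\tau}(\alpha) \equiv \forall\beta \in V^1_\sigma\,\exists\gamma \in V^1_\tau\,(\alpha \mid \beta = \gamma),$$

where $(\alpha \mid \beta) = \gamma \equiv \forall x\,(\lambda n.\alpha(\hat x * n)(\beta) = \gamma x)$. The objects of type 0 are
the pairs $(x, 0)$, objects of type $\sigma \neq 0$ are the pairs $(\alpha, \sigma)$ with $\alpha \in V^1_\sigma$.
Equality is interpreted as
$$(\alpha, \sigma) =_\sigma (\beta, \sigma) \equiv_{\mathrm{def}} \forall x\,(\alpha x = \beta x),$$
and application by ($\sigma, \tau \neq 0$)
$$(\alpha, 1)(x, 0) = (\alpha x, 0), \qquad (\alpha, (0)\sigma)(x, 0) = (\alpha \mid \lambda z.x, \sigma),$$
$$(\alpha, (\sigma)0)(\beta, \sigma) = (\alpha(\beta), 0), \qquad (\alpha, (\sigma)\tau)(\beta, \sigma) = (\alpha \mid \beta, \tau).$$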
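
The two forms of continuous application used above can be tried out concretely. In the following Python sketch a neighbourhood function acts on Python tuples rather than on numerical codes of finite sequences, and the search for a deciding initial segment is cut off at `max_len`; both are simplifications made here, and the sample neighbourhood functions `nbhd_eval_at_first` and `nbhd_shift` are invented for illustration.

```python
def apply_nbhd(alpha, gamma, max_len=1000):
    """alpha(gamma): alpha's value on the shortest initial segment of gamma
    where alpha is nonzero, minus 1."""
    for z in range(max_len + 1):
        segment = tuple(gamma(i) for i in range(z))
        value = alpha(segment)
        if value != 0:
            return value - 1
    raise RuntimeError("no initial segment of length <= max_len decided the value")


def bar_apply(alpha, beta, max_len=1000):
    """alpha | beta: the sequence x -> (lambda n. alpha(<x> * n))(beta)."""
    return lambda x: apply_nbhd(lambda n: alpha((x,) + n), beta, max_len)


def nbhd_eval_at_first(segment):
    """Neighbourhood function of the functional gamma -> gamma(gamma(0))."""
    if len(segment) > 0 and len(segment) > segment[0]:
        return segment[segment[0]] + 1
    return 0


def nbhd_shift(segment):
    """Neighbourhood function of the operator beta -> (x -> beta(x) + x)."""
    if len(segment) == 0:
        return 0
    x, n = segment[0], segment[1:]
    return n[x] + x + 1 if len(n) > x else 0
```

Here `apply_nbhd(nbhd_eval_at_first, gamma)` needs only as much of `gamma` as is required to read off `gamma(gamma(0))`, which is the continuity property the definition is designed to capture.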


From the formalized theory of recursive functionals one then obtains
interpretations of $R_\sigma$, $\Pi_{\sigma,\tau}$, $\Sigma_{\rho,\sigma,\tau}$, $S$, 0.

The properties of ICF($\mathcal{U}$) depend on $\mathcal{U}$ of course: if $\mathcal{U}$ consists of the
recursive functions, type 2 functionals in ICF are not necessarily uniformly
continuous on bounded subsets of functions; if $\mathcal{U}$ satisfies the fan theorem,
uniform continuity holds.


10.3.5. The countable or extensional continuous functionals ECF($\mathcal{U}$)

This is the extensional analogue of ICF($\mathcal{U}$), related to ICF($\mathcal{U}$) in the
same manner as HEO to HRO (KLEENE [1959], KREISEL [1959]).

Other models of N-HA$^\omega$ which we shall not discuss here are (a)
recursive objects of higher type, (b) term models, (c) hereditarily
majorizable functionals (HOWARD [1973]), (d) the hybrid ICF$'$($\mathcal{U}$) obtained by
restricting ICF($\mathcal{U}$) to its recursive elements.


10.4. The uses of non-extensional models


10.4.1. As has been remarked above, HRO may be regarded as a recursive
approximation to the more “fundamental” model LO. As an object of
study, HRO is more manageable than LO because it is defined in terms of
well-known notions. The example of HRO shows that there is nothing
paradoxical or incoherent in the concept of a decidable non-extensional
notion of equality. In fact, HRO is not only a model of N-HA$^\omega$, but also of
the theory I-HA$^\omega$, obtained from N-HA$^\omega$ by addition of the constants $E_\sigma$
satisfying

$$E_\sigma xy \le 1, \qquad E_\sigma xy = 0 \leftrightarrow x =_\sigma y.$$


I-HA$^\omega$ may be viewed as a theory of finite type objects with strict
(intensional) equality: two objects are intensionally equal (identical) if they
are given to us as the same object; and HRO shows that I-HA$^\omega$ has a
simple model. It has been argued that the introduction of intensional
equality in the language depends on a confusion between “use” and
“mention”. But even if one assumes the objects of a type structure to be
rules given by a linguistic representation, there is still the distinction
between the rules as mathematical objects and their names (where it is
irrelevant that the names may be such that they contain complete
descriptions of the rules). HRO illustrates the point: all its objects may
be conceived as rules, but they do not all have names (= closed terms) in
N-HA$^\omega$ or I-HA$^\omega$. I-HA$^\omega$ can be interpreted as a theory of
rules-as-objects.

Note that in the description of models such as HRO, HEO, ICF we have
to make essential use of logic: the definition of $V_\sigma$ for example is of
increasing logical complexity with increasing complexity of $\sigma$.


10.4.2. Technical uses 

(a) One simple application of HRO is immediate: I-HA$^\omega$ is a conservative
extension of HA, since interpretation of the quantifiers in I-HA$^\omega$ as
ranging over the elements of HRO leaves formulae of HA (essentially)
unchanged. (Similarly, E-HA$^\omega$, where equality between higher type
objects is defined by $x^{(\sigma)\tau} = y^{(\sigma)\tau} \equiv \forall z^\sigma\,(xz = yz)$, is a conservative
extension of HA, as may be seen with the help of HEO.)

(b) The preceding application still referred explicitly to non-extensional
equality in the statement of its result. More convincing are results not
referring to non-extensional equality; such results can be obtained by
combination with functional interpretations (modified realizability,
Dialectica interpretation). One such example is given in 11.7.3; for other
examples, see TROELSTRA [1975], §3 and the references given there.

The crucial fact on which these applications rest is that schemata such as 
CT, CONT, have a functional interpretation only if we permit non- 
extensional models for the finite type structure. 


10.5. Bar-recursive functionals 


The so-called bar-recursive functionals have attracted much attention
ever since they were introduced in SPECTOR [1962]. For a survey, see
KREISEL [1968].

Here our principal reason for including them is to show how to obtain a 
Dialectica interpretation for (a part of) classical analysis (cf. 11.8); this in 
turn will be used in discussing the constructivization of classical theorems 
in Section 12. We cannot show all the steps of the way, but we intend to 
give the reader an idea of the route to be followed; the details may be 
completed with the help of easily accessible sources. 

Let us extend N-HA$^\omega$ by addition of types $\sigma^{<\omega}$ for finite sequences of
objects of type $\sigma$, with corresponding variables, and some obvious axioms
stating properties of the length function, concatenation etc. The result is a
definitional extension, since we can identify the objects of type $\sigma^{<\omega}$ with
special sequences of type $(0)\sigma$, e.g. as follows.

Code natural numbers into higher types, putting $n_0 = n$, $n_{(\sigma)\tau} = \Pi_{\tau,\sigma}\,n_\tau$;
then $n_\sigma$ codes $n$ in type $\sigma$. We then code $\langle x_0^\sigma,\dots,x_{u-1}^\sigma\rangle$ as $z^{(0)\sigma}$ with

$$z(0) = u_\sigma, \qquad z(i + 1) = x_i^\sigma \text{ for } i < u,$$
$$zi = 0_\sigma \text{ for } i > u.$$

Now let $c$ be a variable of type $\sigma^{<\omega}$, ranging over finite sequences of objects
of type $\sigma$, and let us adopt notations similar to those for finite sequences of
natural numbers, e.g. $c_1 * c_2$, $\hat u = \langle u\rangle$ ($u \in \sigma$), $\mathrm{lth}(c)$.

Let $[c]$ denote a sequence of type $(0)\sigma$ where, if $c = \langle u_0,\dots,u_{x-1}\rangle$,
$[c](i) = u_i$ for $i < x$ and $[c](i) = 0_\sigma$ for $i \ge x$.

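We note, for reference below, that we can define a $\lambda$-abstraction
operator of type $\sigma$ by induction on the construction of terms by

$$\lambda x^\sigma.x^\sigma = \Sigma_{\sigma,(\sigma)\sigma,\sigma}\,\Pi_{\sigma,(\sigma)\sigma}\,\Pi_{\sigma,\sigma},$$

$$\lambda x^\sigma.t^\tau = \Pi_{\tau,\sigma}\,t^\tau \quad \text{for } x \text{ not occurring in } t,$$
$$\lambda x^\sigma.t_1 t_2 = \Sigma(\lambda x^\sigma.t_1)(\lambda x^\sigma.t_2).$$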
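
The three clauses translate directly into a recursive procedure on terms. The Python sketch below is untyped and uses an ad hoc tuple representation of terms (`var`, `app`, `PI`, `SIGMA`, `lam` and `evaluate` are hypothetical names introduced here); the evaluator is included only so that the result of abstraction can be checked against the defining axioms of 10.2.

```python
def var(name):
    return ("var", name)


def app(f, *args):
    """Left-nested application: app(f, a, b) is (f a) b."""
    for a in args:
        f = ("app", f, a)
    return f


PI, SIGMA = ("const", "Pi"), ("const", "Sigma")


def occurs(x, t):
    if t[0] == "var":
        return t[1] == x
    if t[0] == "app":
        return occurs(x, t[1]) or occurs(x, t[2])
    return False


def lam(x, t):
    """Abstraction of the variable x from the term t by the three clauses."""
    if t == var(x):
        return app(SIGMA, PI, PI)                    # lambda x. x = Sigma Pi Pi
    if not occurs(x, t):
        return app(PI, t)                            # lambda x. t = Pi t
    return app(SIGMA, lam(x, t[1]), lam(x, t[2]))    # t is an application t1 t2


CONSTS = {
    "Pi": lambda x: lambda y: x,
    "Sigma": lambda x: lambda y: lambda z: x(z)(y(z)),
}


def evaluate(t, env):
    """Interpret a term with curried Python functions for the constants."""
    if t[0] == "var":
        return env[t[1]]
    if t[0] == "const":
        return CONSTS[t[1]]
    return evaluate(t[1], env)(evaluate(t[2], env))
```

For example, abstracting `x` from the term `f x x` yields a combinator term that, once evaluated, sends `a` to `f a a`.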


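The bar-recursion constants $B^\tau_\sigma$ (rank $\sigma$, level $\tau$) then should satisfy the
defining axioms

$$\mathrm{BR}_\sigma \qquad \begin{cases} y[c] < \mathrm{lth}(c) \to B^\tau_\sigma yzuc = zc, \\ y[c] \ge \mathrm{lth}(c) \to B^\tau_\sigma yzuc = u(\lambda v.B^\tau_\sigma yzu(c * \hat v))c, \end{cases}$$
with $z \in (\sigma^{<\omega})\tau$, $y \in ((0)\sigma)0$, $u \in ((\sigma)\tau)(\sigma^{<\omega})\tau$.


To see (at least classically) that $\mathrm{BR}_\sigma$ defines a functional, $y$ must be
assumed to be continuous, i.e. $yz$ depending on an initial segment of $z$
only; then for $y[c] < \mathrm{lth}(c)$, $B^\tau_\sigma yzuc$ is determined directly, and for
$y[c] \ge \mathrm{lth}(c)$ the computation of $B^\tau_\sigma yzuc$ is thrown back on the
computation of $B^\tau_\sigma yzu(c * \hat v)$ for all $v$ of type $\sigma$. If the set of $c$ such that
$y[c] \ge \mathrm{lth}(c)$ constitutes a well-founded tree (classically a consequence of
continuity) the computation is finally reduced to cases with $y[c] < \mathrm{lth}(c)$.

We shall often simply write $B_\sigma$ for $B^\tau_\sigma$. $\mathrm{BR} = \bigcup_{\sigma \in T} \mathrm{BR}_\sigma$.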
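
The two defining equations can be run directly when both rank and level are the base type. In the Python sketch below finite sequences are tuples of integers, `extend(c)` plays the role of $[c]$, and `bar_rec` is a hypothetical name; termination is not guaranteed in general and depends, as explained above, on `y` being continuous and on `u` calling its first argument at only finitely many values.

```python
def extend(c, zero=0):
    """[c]: the finite sequence c continued by zeros."""
    return lambda i: c[i] if i < len(c) else zero


def bar_rec(y, z, u):
    """Return B with  B c = z c                              if y[c] <  lth(c),
                      B c = u (lambda v. B (c * <v>)) c      if y[c] >= lth(c)."""
    def B(c):
        if y(extend(c)) < len(c):
            return z(c)
        return u(lambda v: B(c + (v,)))(c)
    return B


# Sample data, chosen here only to exercise both equations:
y_sample = lambda a: a(0) + 1            # continuous: inspects a(0) only
z_sample = sum                           # value at a secured sequence
u_sample = lambda f: lambda c: 1 + f(1)  # always extends c by the entry 1
```

With these sample arguments the recursion started at the empty sequence passes through `(1,)` and `(1, 1)` and stops at `(1, 1, 1)`, the first sequence whose length exceeds the value 2 of `y_sample`.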


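10.5.1. DEFINITION. $\mathrm{BI}_\sigma$ denotes the schema

$$(1) \wedge (2) \wedge (3) \wedge (4) \to Q(\langle\ \rangle),$$

where
(1) $\forall z^{(0)\sigma}\,\exists x\,P(\bar z x)$,
(2) $\forall cc'\,(Pc \to P(c * c'))$,
(3) $\forall c\,(Pc \to Qc)$,
(4) $\forall c\,(\forall u^\sigma\,Q(c * \hat u) \to Qc)$.
($\mathrm{BI}_0$ therefore corresponds to $\mathrm{BI_M}$.)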


10.5.2. THEOREM. ECF($\mathcal{U}$) for a universe $\mathcal{U}$ satisfying $\mathrm{BI_D}$ is a model for
$\mathrm{BR}_0$, $\mathrm{BR}_1$, i.e. contains elements $([B^\tau_0], \sigma')$, $([B^\tau_1], \sigma'')$ for appropriate $\sigma'$, $\sigma''$
which satisfy the equations for bar-recursion of types 0 and 1.


PROOF (outline). One can show either directly, or with the help of a
recursion theorem analogue (TROELSTRA [1973a] 1.9.16, 2.9.10) the
existence of constant (primitive recursive) functions $\varepsilon_0$, $\varepsilon_1$ satisfying the
equations for $\mathrm{BR}_0$, $\mathrm{BR}_1$, but with $\simeq$ instead of $=$; to show that $\varepsilon_i \in W_{i^*}$
where $i^*$ is the type of $B^\tau_i$ for $i = 0, 1$ (which would justify the choice of $\varepsilon_i$
for $[B^\tau_i]$) we need $\mathrm{BI}_0$, $\mathrm{BI}_1$ respectively. As shown in HOWARD and KREISEL
[1966], Theorem 7B, $\mathrm{BI}_1$ can be obtained from $\mathrm{BI_D}$; the instances of the
“strong axiom of continuity” needed for the proof hold trivially in the
special cases of $\mathrm{BI}_1$ needed here. □


11. The Dialectica interpretation 


11.1. Some conventions and notations 


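We shall use $\mathbf{x}, \mathbf{y}, \mathbf{z}, \mathbf{u}, \mathbf{X}, \mathbf{Y}, \dots$ for sequences of variables of finite type,
$\mathbf{s}, \mathbf{t}, \dots$ for sequences of terms of finite type. When $\mathbf{s} = (s_1,\dots,s_n)$,
$s_i \in (\tau_1)\cdots(\tau_m)\sigma_i$, $\mathbf{t} = (t_1,\dots,t_m)$, $t_i \in \tau_i$, then

$$\mathbf{s}\mathbf{t} = (s_1 t_1 \cdots t_m, \dots, s_n t_1 \cdots t_m).$$

For $\mathbf{t}$ empty, $\mathbf{s}\mathbf{t} = \mathbf{s}$; for $\mathbf{s}$ empty, $\mathbf{s}\mathbf{t}$ is empty. $\forall\mathbf{s}$ abbreviates $\forall s_1 \cdots \forall s_n$, $\forall\mathbf{s}\mathbf{t}$
abbreviates $\forall\mathbf{s}\,\forall\mathbf{t}$; similarly for $\exists\mathbf{s}$ and $\exists\mathbf{s}\mathbf{t}$.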
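
The convention for applying a sequence of terms to a sequence of arguments is easy to state in code. A small Python sketch, assuming the members of the first sequence are curried functions; `apply_tuple`, `add` and `mul` are names introduced here for illustration:

```python
from functools import reduce


def apply_tuple(s, t):
    """st = (s_1 t_1 ... t_m, ..., s_n t_1 ... t_m) for curried s_i."""
    return tuple(reduce(lambda f, arg: f(arg), t, s_i) for s_i in s)


add = lambda a: lambda b: a + b
mul = lambda a: lambda b: a * b
```

Both boundary cases of the convention come out as stated: an empty argument sequence returns the sequence unchanged, and an empty sequence of terms returns the empty sequence.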


11.2. Definition of the translation 


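To each formula $A\mathbf{a}$ of the language N-HA$^\omega$ we associate another
formula $A^D \equiv \exists\mathbf{z}\,\forall\mathbf{y}\,A_D(\mathbf{z}, \mathbf{y}, \mathbf{a})$, $A_D$ quantifier-free, as follows. The types
of $\mathbf{z}$, $\mathbf{y}$ depend on the logical structure of $A$ only; all the free variables of
$A^D$ are contained among the free variables of $A$.

d(i) If $A$ is prime, then $A^D \equiv A_D \equiv A$.
For the other clauses, let $A^D \equiv \exists\mathbf{z}\,\forall\mathbf{y}\,A_D(\mathbf{z}, \mathbf{y})$, $B^D \equiv \exists\mathbf{u}\,\forall\mathbf{v}\,B_D(\mathbf{u}, \mathbf{v})$.
d(ii) $(A \wedge B)^D \equiv \exists\mathbf{z}\mathbf{u}\,\forall\mathbf{y}\mathbf{v}\,[A \wedge B]_D \equiv \exists\mathbf{z}\mathbf{u}\,\forall\mathbf{y}\mathbf{v}\,[A_D(\mathbf{z}, \mathbf{y}) \wedge B_D(\mathbf{u}, \mathbf{v})]$.
d(iii) $(A \vee B)^D \equiv \exists z^0\mathbf{z}\mathbf{u}\,\forall\mathbf{y}\mathbf{v}\,[A \vee B]_D \equiv$
$\equiv \exists z^0\mathbf{z}\mathbf{u}\,\forall\mathbf{y}\mathbf{v}\,[(z = 0 \to A_D(\mathbf{z}, \mathbf{y})) \wedge (z \neq 0 \to B_D(\mathbf{u}, \mathbf{v}))]$.
d(iv) $(\exists z\,Az)^D \equiv \exists z\mathbf{z}\,\forall\mathbf{y}\,(\exists z\,Az)_D \equiv \exists z\mathbf{z}\,\forall\mathbf{y}\,A_D(\mathbf{z}, \mathbf{y}, z)$.
d(v) $(\forall z\,Az)^D \equiv \exists\mathbf{Z}\,\forall z\mathbf{y}\,(\forall z\,Az)_D \equiv \exists\mathbf{Z}\,\forall z\mathbf{y}\,A_D(\mathbf{Z}z, \mathbf{y}, z)$.
d(vi) We split the construction of $(A \to B)^D$ into a number of smaller
steps:

$$\begin{aligned}
(A \to B)^D &\equiv [\exists\mathbf{z}\,\forall\mathbf{y}\,A_D \to \exists\mathbf{u}\,\forall\mathbf{v}\,B_D]^D && \text{(a)}\\
&\equiv [\forall\mathbf{z}\,(\forall\mathbf{y}\,A_D \to \exists\mathbf{u}\,\forall\mathbf{v}\,B_D)]^D && \text{(b)}\\
&\equiv [\forall\mathbf{z}\,\exists\mathbf{u}\,(\forall\mathbf{y}\,A_D \to \forall\mathbf{v}\,B_D)]^D && \text{(c)}\\
&\equiv [\forall\mathbf{z}\,\exists\mathbf{u}\,\forall\mathbf{v}\,(\forall\mathbf{y}\,A_D \to B_D)]^D && \text{(d)}\\
&\equiv [\forall\mathbf{z}\,\exists\mathbf{u}\,\forall\mathbf{v}\,\exists\mathbf{y}\,(A_D \to B_D)]^D && \text{(e)}\\
&\equiv \exists\mathbf{U}\mathbf{Y}\,\forall\mathbf{z}\mathbf{v}\,(A_D(\mathbf{z}, \mathbf{Y}\mathbf{z}\mathbf{v}) \to B_D(\mathbf{U}\mathbf{z}, \mathbf{v})). && \text{(f)}
\end{aligned}$$

$$(A \to B)_D \equiv A_D(\mathbf{z}, \mathbf{Y}\mathbf{z}\mathbf{v}) \to B_D(\mathbf{U}\mathbf{z}, \mathbf{v}).$$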


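11.2.1. Note: with classical logic and AC, $A^D \leftrightarrow A$ for all $A$, since (even
intuitionistically) $A^D \leftrightarrow A$ for prime $A$, $(A \wedge B)^D \leftrightarrow (A^D \wedge B^D)$,
$(A \vee B)^D \leftrightarrow A^D \vee B^D$, $(\exists z\,A)^D \leftrightarrow \exists z\,A^D$; with AC $(\forall z\,A)^D \leftrightarrow \forall z\,A^D$,
and with AC and classical logic $(A \to B)^D \leftrightarrow (A^D \to B^D)$ as shown by
inspection of the steps (a)-(f). This result will be refined below in 11.5-11.6.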


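11.2.2. Note also that

(i) if $A \equiv \exists\mathbf{z}\,\forall\mathbf{y}\,B$, $B$ quantifier-free, then $A^D \equiv A$;

(ii) for quantifier-free $B$, $(\neg\neg\exists\mathbf{z}\,B)^D \equiv (\neg\forall\mathbf{z}\,\neg B)^D \equiv \exists\mathbf{z}\,\neg\neg B$;
for the extensions and subsystems of N-HA$^\omega$ which interest us
quantifier-free formulae are decidable, hence $(\neg\neg\exists\mathbf{z}\,B)^D \leftrightarrow \exists\mathbf{z}\,B$ in such cases.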
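
As a check on 11.2.2(ii), the clauses of 11.2 can be applied by hand. The derivation below assumes, as is usual in arithmetic, that $\neg C$ is read as $C \to \bot$ with $\bot$ a false prime formula such as $0 = 1$; this reading of negation and the bold capital $\mathbf{Z}$ for the resulting witness are conventions adopted here for the illustration.

$$\begin{aligned}
(\exists\mathbf{z}\,B)^D &\equiv \exists\mathbf{z}\,B && \text{(no universal part)}\\
(\neg\exists\mathbf{z}\,B)^D &\equiv (\exists\mathbf{z}\,B \to \bot)^D \equiv \forall\mathbf{z}\,(B(\mathbf{z}) \to \bot) \equiv \forall\mathbf{z}\,\neg B(\mathbf{z}) && \text{(d(vi) with } \mathbf{y}, \mathbf{u}, \mathbf{v} \text{ empty)}\\
(\neg\neg\exists\mathbf{z}\,B)^D &\equiv (\forall\mathbf{z}\,\neg B \to \bot)^D \equiv \exists\mathbf{Z}\,(\neg B(\mathbf{Z}) \to \bot) \equiv \exists\mathbf{Z}\,\neg\neg B(\mathbf{Z}) && \text{(d(vi) with only the universal } \mathbf{z} \text{ present)}
\end{aligned}$$

In the last step the premise has an empty existential part and universal part $\mathbf{z}$, while the conclusion $\bot$ has no quantifiers at all, so the counterexample functional of clause d(vi) takes no arguments and is simply a sequence $\mathbf{Z}$ of the same types as $\mathbf{z}$.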


11.2.3. The motivation for the definition of $(A \to B)^D$ may be based on the
following two principles: (a) to establish $\exists x\,Ax \to \exists y\,By$, produce a
functional $Y$ computing the $y$ from the $x$, i.e. $Ax \to B(Yx)$; (b) to establish
$\forall x\,Ax \to \forall y\,By$, read it as $\exists y\,\neg By \to \exists x\,\neg Ax$, and apply (a); with
another contraposition $A(Xy) \to By$.


11.3. DEFINITION. Let WE-HA$^\omega$ be the system similar to N-HA$^\omega$, but
with only $=_0$ as a primitive, $=_\sigma$ being defined hereditarily by

$$x^{(\sigma)\tau} = y^{(\sigma)\tau} \equiv \forall z^\sigma\,(xz = yz)$$
and with a rule of extensionality

$$\vdash P \to tx_1 \cdots x_n = sx_1 \cdots x_n,\ \vdash At \ \Rightarrow\ \vdash P \to As,$$
where $x_1,\dots,x_n$ are such that $tx_1 \cdots x_n$, $sx_1 \cdots x_n$ are numerical terms, $P$ a
propositional combination of prime formulae (i.e. equations between terms
of type 0).
qf-WE-HA$^\omega$, qf-I-HA$^\omega$ are the quantifier-free parts of WE-HA$^\omega$,
I-HA$^\omega$, with induction replaced by the rule (for quantifier-free $A$)

$$\vdash A0,\ \vdash Ax \to A(Sx) \ \Rightarrow\ \vdash Ax.$$


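11.4.1. SOUNDNESS THEOREM. For H = WE-HA$^\omega$ or I-HA$^\omega$: if H $\vdash A$,
then for a suitable sequence of terms $\mathbf{t}$,

$$\text{qf-H} \vdash A_D(\mathbf{t}, \mathbf{y}).$$


For most applications, the following weaker corollary suffices:


11.4.2. COROLLARY. For H = WE-HA$^\omega$, I-HA$^\omega$: if H $\vdash A$, then
H $\vdash \forall\mathbf{y}\,A_D(\mathbf{t}, \mathbf{y})$ for a suitable sequence of terms $\mathbf{t}$.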


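PROOF. The proof is by induction on the length of derivations in H. Most
cases are routine, except the verification for $A \to A \wedge A$ and induction. If
we are only interested in establishing the corollary, induction becomes
straightforward too. So let us restrict attention to $A \to A \wedge A$. Assume
$A^D \equiv \exists\mathbf{z}\,\forall\mathbf{y}\,A_D$;

$$[A \to A \wedge A]^D \equiv \exists\mathbf{Y}\mathbf{Z}'\mathbf{Z}''\,\forall\mathbf{z}\mathbf{y}'\mathbf{y}''\,[A_D(\mathbf{z}, \mathbf{Y}\mathbf{z}\mathbf{y}'\mathbf{y}'') \to A_D(\mathbf{Z}'\mathbf{z}, \mathbf{y}') \wedge A_D(\mathbf{Z}''\mathbf{z}, \mathbf{y}'')].$$

To find a solution, at a first try take $\lambda\mathbf{z}.\mathbf{z}$ for $\mathbf{Z}'$ and $\mathbf{Z}''$, and look for a $\mathbf{Y}$
with the following properties: if $A_D(\mathbf{z}, \mathbf{y}')$ is false, $A_D(\mathbf{z}, \mathbf{Y}\mathbf{z}\mathbf{y}'\mathbf{y}'')$ must be
false, which can be achieved taking $\mathbf{Y}\mathbf{z}\mathbf{y}'\mathbf{y}'' = \mathbf{y}'$ in this case; if $A_D(\mathbf{z}, \mathbf{y}')$ is
true, we want $\mathbf{Y}\mathbf{z}\mathbf{y}'\mathbf{y}'' = \mathbf{y}''$. Since the prime formulae in I-HA$^\omega$ and
WE-HA$^\omega$ are decidable, and equality functionals are available, we can
construct for each quantifier-free $B\mathbf{a}$ a term $T_B$ such that $\vdash T_B\mathbf{a} = 0 \leftrightarrow B\mathbf{a}$;
thus we may take for $\mathbf{Y}$ the term $\mathbf{T}$ defined by

$$\mathbf{T}\mathbf{z}\mathbf{y}'\mathbf{y}'' = \begin{cases} \mathbf{y}' & \text{if } T_{A_D}\mathbf{z}\mathbf{y}' \neq 0, \\ \mathbf{y}'' & \text{if } T_{A_D}\mathbf{z}\mathbf{y}' = 0. \end{cases} \qquad \Box$$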
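
The case distinction that interprets $A \to A \wedge A$ can be exercised on a decidable sample matrix. In the Python sketch below `a_d` stands for a decidable $A_D(z, y)$ on single numbers rather than sequences, and `make_Y`, `interprets_contraction` and `sample_a_d` are hypothetical names; the check runs over finite ranges only and is meant as an illustration of the construction, not as a proof.

```python
def make_Y(a_d):
    """Return Y with  Y z y1 y2 = y1 if A_D(z, y1) fails, and y2 otherwise."""
    def Y(z, y1, y2):
        return y1 if not a_d(z, y1) else y2
    return Y


def interprets_contraction(a_d, zs, ys):
    """Check A_D(z, Y z y1 y2) -> A_D(z, y1) and A_D(z, y2) on finite ranges."""
    Y = make_Y(a_d)
    return all(
        (not a_d(z, Y(z, y1, y2))) or (a_d(z, y1) and a_d(z, y2))
        for z in zs for y1 in ys for y2 in ys
    )


# A sample decidable matrix, chosen only for demonstration.
sample_a_d = lambda z, y: (z + y) % 3 != 0
```

The construction works for any decidable `a_d`: whenever the first challenge `y1` already refutes the matrix it is returned, and otherwise the second challenge is passed on, exactly as in the definition of the term above.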


The use of the decidability of prime formulae is essential here, as shown
by an example due to W.A. Howard (cf. e.g. TROELSTRA [1973a], 2.7.8,
3.5.6): all functionals of N-HA$^\omega$ are continuous, but the interpretation of
$\forall y^1 z^1\,\exists u^0\,(u = 0 \leftrightarrow y = z)$ by a continuous functional would imply


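11.5. LEMMA. Let $\mathrm{IP}'_0$, $\mathrm{M}'$ be the schemata

$$\mathrm{IP}'_0 \qquad (\forall\mathbf{z}\,A\mathbf{z} \to \exists\mathbf{y}\,B\mathbf{y}) \to \exists\mathbf{y}\,(\forall\mathbf{z}\,A\mathbf{z} \to B\mathbf{y}),$$
$$\mathrm{M}' \qquad \neg\neg\exists\mathbf{z}\,A\mathbf{z} \to \exists\mathbf{z}\,A\mathbf{z}$$

($A$ quantifier-free, $\mathbf{y}$ not free in $A$).


(i) For H = I-HA$^\omega$, WE-HA$^\omega$, $F$ an instance of $\mathrm{IP}'_0$, $\mathrm{M}'$, or AC,
$F^D \equiv \exists\mathbf{x}\,\forall\mathbf{y}\,F_D(\mathbf{x}, \mathbf{y})$, there is a sequence $\mathbf{t}$ such that H $\vdash \forall\mathbf{y}\,F_D(\mathbf{t}, \mathbf{y})$.
(ii) H + $\mathrm{IP}'_0$ + $\mathrm{M}'$ + AC $\vdash A \leftrightarrow A^D$ for all $A$.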


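PROOF. (i) is straightforward. (ii) is proved by induction on the complexity
of $A$. The only case requiring some attention occurs when proving
$(A \to B) \leftrightarrow (A \to B)^D$ on assumption of $A \leftrightarrow A^D$, $B \leftrightarrow B^D$. We refer to
the formulae between brackets in the definition of $[A \to B]^D$ as
(a), (b), ..., (f) respectively. The transition from $A \to B$ to (a) is permitted
by the induction hypothesis, the transition from (a) to (b) by intuitionistic
logic, the transition from (b) to (c) requires $\mathrm{IP}'_0$, (c) to (d) is permitted by
intuitionistic logic again, and the transition from (d) to (e) requires $\mathrm{M}'$,
since

$$\begin{aligned}
(\forall\mathbf{y}\,A_D \to B_D) &\leftrightarrow B_D \vee (\neg B_D \wedge \neg\forall\mathbf{y}\,A_D) \\
&\leftrightarrow B_D \vee (\neg B_D \wedge \neg\neg\exists\mathbf{y}\,\neg A_D) \\
&\leftrightarrow \text{(with help of } \mathrm{M}'\text{)}\ B_D \vee (\neg B_D \wedge \exists\mathbf{y}\,\neg A_D) \\
&\leftrightarrow \exists\mathbf{y}\,(B_D \vee (\neg B_D \wedge \neg A_D)) \leftrightarrow \exists\mathbf{y}\,(A_D \to B_D).
\end{aligned}$$

The transition from (e) to (f) is justified by AC. □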


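11.6. CHARACTERIZATION THEOREM. For H = I-HA$^\omega$, WE-HA$^\omega$:
(i) H + $\mathrm{M}^\omega$ + $\mathrm{IP}^\omega_0$ + AC $\vdash A \leftrightarrow A^D$,
(ii) H + $\mathrm{M}^\omega$ + $\mathrm{IP}^\omega_0$ + AC $\vdash A$ $\iff$ H $\vdash A^D$.

Here $\mathrm{M}^\omega$, $\mathrm{IP}^\omega_0$ are the schemata

$$\mathrm{M}^\omega \qquad \forall\mathbf{z}\,(A \vee \neg A) \wedge \neg\neg\exists\mathbf{z}\,A \to \exists\mathbf{z}\,A,$$
$$\mathrm{IP}^\omega_0 \qquad \forall\mathbf{z}\,(A \vee \neg A) \wedge (\forall\mathbf{z}\,A \to \exists\mathbf{y}\,B) \to \exists\mathbf{y}\,(\forall\mathbf{z}\,A \to B).$$
PROOF. (i) and (ii) hold with $\mathrm{IP}'_0$, $\mathrm{M}'$ instead of $\mathrm{IP}^\omega_0$, $\mathrm{M}^\omega$. On the one hand, in
WE-HA$^\omega$, I-HA$^\omega$ quantifier-free formulae are provably decidable (for
WE-HA$^\omega$ since $\forall x^0 y^0\,(x = y \vee x \neq y)$ is a theorem of HA, and for I-HA$^\omega$
because of this and the presence of $E_\sigma$ (cf. 10.4.1)), so $\mathrm{IP}'_0$, $\mathrm{M}'$ are
consequences of $\mathrm{M}^\omega$, $\mathrm{IP}^\omega_0$. On the other hand, $\mathrm{IP}^\omega_0$, $\mathrm{M}^\omega$ are derivable from
$\mathrm{IP}'_0$, $\mathrm{M}'$, AC since by AC

$$\forall\mathbf{z}\,(A \vee \neg A) \to \exists Z\,\forall\mathbf{z}\,((Z\mathbf{z} = 0 \to A) \wedge (Z\mathbf{z} \neq 0 \to \neg A))$$
and thus, replacing $A$ by $Z\mathbf{z} = 0$, $\mathrm{M}^\omega$ and $\mathrm{IP}^\omega_0$ reduce to $\mathrm{M}'$, $\mathrm{IP}'_0$. □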


11.7. Applications of the Dialectica interpretation 


11.7.1. Foundational applications 

The Dialectica interpretation and translation were introduced in GÖDEL
[1958]. The original aim was to give a consistency proof for intuitionistic
arithmetic and hence (via the negative translation, see 3.8-3.10) for
classical arithmetic, by elimination of logical operators in favour of higher
type objects; the notion of a lawlike (constructive) operation of higher type
(LO in 10.3) was regarded as intuitively clear and a legitimate extension of
finitistic concepts.

TAIT [1967] supplemented this by a proof that every closed term of type 0
in Gödel’s calculus (corresponding to our qf-WE-HA$^\omega$ or to qf-I-HA$^\omega$)
could be evaluated, yielding a number as value; but in his proof he needed
induction over arithmetically defined predicates of unbounded complexity.
In HOWARD [1970] and HINATA [1967] it was shown how to replace this by
an assignment of ordinals $< \varepsilon_0$ to closed terms and quantifier-free
$\varepsilon_0$-induction, thus ultimately achieving the same proof-theoretic
reduction as in the Gentzen consistency proof for arithmetic.

As a consequence it remains debatable whether a real reduction has
been achieved, depending on one’s view of what is intuitively evident. It is
worth noting, however, that for systems such as WE-HA$^\omega$, I-HA$^\omega$ the
interpretation shows how logical complexity can be “absorbed” by the
complexity due to higher type objects.


11.7.2. Technical application: Markov’s rule 
One of the best-known and most useful applications of the Dialectica
translation is to establish closure under Markov’s rule:

$$\mathrm{MR} \qquad \vdash \forall x^\sigma\,(A \vee \neg A),\ \vdash \neg\neg\exists x^\sigma A \ \Rightarrow\ \vdash \exists x^\sigma A$$

for intuitionistic formal systems (see 11.9(i)). For classical systems H$^c$
which are conservative over the corresponding intuitionistic system H w.r.t.
negative formulae (such as HA, cf. 3.8-3.10) closure under MR implies that
H$^c$ is conservative over H w.r.t. $\Pi^0_2$-sentences:

$$\mathrm{H}^c \vdash \exists x^0 A \iff \mathrm{H}^c \vdash \neg\forall x^0\,\neg A \iff \mathrm{H} \vdash \neg\forall x^0\,\neg A \ \Rightarrow\ \mathrm{H} \vdash \exists x^0 A.$$

As noted in 9.1, this can be used to justify intuitionistically
metamathematical results obtained via classical reasoning on Kripke models. Other
derived rules are given in 11.9 below.
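
For objects of type 0 the informal idea behind Markov’s rule is unbounded search: if a decidable property cannot fail everywhere, testing 0, 1, 2, ... in turn will find a witness. The Python sketch below shows only this intuition; the closure result in the text is obtained from the Dialectica interpretation, which supplies a term, and `mu_search` with its optional `limit` argument is a helper invented here.

```python
def mu_search(decide, limit=None):
    """Least x with decide(x) true.

    Loops forever if no witness exists and limit is None; with a limit,
    raises ValueError when no witness is found below it.
    """
    x = 0
    while limit is None or x < limit:
        if decide(x):
            return x
        x += 1
    raise ValueError("no witness below the given limit")
```

The optional bound is there only to keep experiments finite; the rule itself gives no bound on the search in advance.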


11.7.3. Technical application: conservative extension results with the help 
of non-extensional models 

We use a non-extensional model for I-HA$^\omega$, namely HRO, to show that
HA + M + IP$_0$ + CT$_0$ is conservative w.r.t. prenex formulae of HA. Let e.g.
$A \equiv \forall x_1\,\exists y_1\,\forall x_2\,\exists y_2\,B(x_1, y_1, x_2, y_2)$, $B$ quantifier-free, HA + M + IP$_0$ +
CT$_0$ $\vdash A$.

We observe that (CT$_0$)$^D$ holds in HRO, that is to say, for any instance $F$
of CT$_0$, $[F^D]_{\mathrm{HRO}}$ is provable in HA. Combining this with the Soundness
Theorem (11.4.1) we have HA $\vdash [A^D]_{\mathrm{HRO}}$; and since HA $\vdash [A^D]_{\mathrm{HRO}} \to A$ for
prenex $A$, HA $\vdash A$.


11.7.4. Constructivization of classical theorems and proofs (See Section 12.) 
In preparation for this application, we discuss in 11.8 the extension of the 
Dialectica interpretation to analysis. 


11.8. Extensions to analysis 


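We now extend WE-HA$^\omega$ to a system BR$_\sigma$, adding the rule BR$_\sigma$. From
HOWARD [1968] for example, we can extract the following extension of the
Soundness Theorem:


11.8.1. THEOREM.
$$\mathrm{BR}_\sigma + \mathrm{AC} + \mathrm{BI}_\sigma + \mathrm{M}^\omega + \mathrm{IP}^\omega_0 \vdash A \iff \mathrm{BR}_\sigma \vdash A^D.$$
($\mathrm{BI}_\sigma$ defined as in 10.5.)

11.8.2. DEFINITION. $\Delta^1_1$-comprehension ($\Delta^1_1$-CA) is the following
comprehension schema:
$$\Delta^1_1\text{-CA} \qquad \forall y\,[\forall\alpha\,\exists x\,A(\alpha, x, y) \leftrightarrow \exists\beta\,\forall z\,B(\beta, z, y)] \to \exists\gamma\,\forall y\,[\gamma y = 0 \leftrightarrow \forall\alpha\,\exists x\,A(\alpha, x, y)]$$
($A$, $B$ quantifier-free). Essentially-$\Delta^1_1$-comprehension (ess. $\Delta^1_1$-CA) is
formulated similarly, but with $\forall\alpha\,\exists x$ replaced by a string
$\forall\alpha_1\,\exists x_1\,\forall\alpha_2\,\exists x_2 \cdots \forall\alpha_n\,\exists x_n$ and $\exists\beta\,\forall z$ by $\exists\beta_1\,\forall z_1 \cdots \exists\beta_n\,\forall z_n$. In $\Delta^1_1$-CA
and ess. $\Delta^1_1$-CA parameters may be present.

Note that in the presence of $\forall$-AC$_{01}$ (AC$_{01}$ restricted to purely universal
formulae) $\Delta^1_1$-CA is equivalent to ess. $\Delta^1_1$-CA.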


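11.8.3. LEMMA. $\forall$-AC$_{01}$ implies $\Delta^1_1$-CA classically.


PROOF. Classically, with
$$V \equiv \forall y\,[\forall\alpha\,\exists x\,A(\alpha, x, y) \leftrightarrow \exists\beta\,\forall z\,B(\beta, z, y)]$$
($A$, $B$ quantifier-free) we have
$$V \to \forall y\,\exists u\,[u = 0 \leftrightarrow \forall\alpha\,\exists x\,A(\alpha, x, y)]$$
or equivalently
$$V \to \forall y\,\exists u\,[(u \neq 0 \to \exists\alpha\,\forall x\,\neg A(\alpha, x, y)) \wedge (u = 0 \to \exists\beta\,\forall z\,B(\beta, z, y))]$$
hence
$$V \to \forall y\,\exists u\,\exists\alpha\beta\,\forall xz\,[(u \neq 0 \to \neg A(\alpha, x, y)) \wedge (u = 0 \to B(\beta, z, y))]$$
etc. □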


11.8.4. THEOREM. Let $F$ be an instance of $\forall$-AC$_{01}$. Then if $F'$ is the negative
Gödel translation of $F$ (cf. 3.8), BR$_1$ $\vdash (F')^D$.


PROOF. See HOWARD [1968]. □

11.8.5. COROLLARY. EL$^c$ + ess. $\Delta^1_1$-CA $\vdash A$ $\Rightarrow$ BR$_1$ $\vdash (A')^D$.


Note that ess. $\Delta^1_1$-CA includes arithmetical and hyperarithmetical
comprehension (as is easily seen by quantifier manipulations).


11.8.6. REMARK. HOWARD [1968] in fact establishes much more than is
stated in the theorem; full classical analysis can be Dialectica interpreted
(via the negative translation) in $\mathrm{BR} = \bigcup_{\sigma \in T} \mathrm{BR}_\sigma$. Howard’s treatment is an
improvement of SPECTOR [1962].


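11.9. THEOREM (derived rules as applications of the Dialectica translation).
(i) In WE-HA$^\omega$ or BR, MR holds:
$$\mathrm{MR} \qquad \vdash \forall x^\sigma\,(A \vee \neg A),\ \vdash \neg\neg\exists x^\sigma A \ \Rightarrow\ \vdash \exists x^\sigma A.$$
(ii) Let $A$ be prenex, $B$ quantifier-free. Then
$$\mathrm{HA}^c \vdash A \to \forall u\,\exists v\,B(u, v) \ \Rightarrow\ \mathrm{HA} \vdash A \to \forall u\,\exists v\,B(u, v).$$

(iii) Let $C \equiv A \to \forall u\,\exists v\,B(u, v)$ be as in (ii), but with numerical $\forall$ in $A$
only; then

$$\mathrm{EL}^c + \text{ess. } \Delta^1_1\text{-CA} \vdash C \ \Rightarrow\ \mathrm{EL} + \mathrm{BI_D} + \mathrm{AC}_{01} \vdash C.$$

(iv) If $A$ is purely universal, then the conclusion of (iii) can be reinforced
to IDB$_1$ $\vdash C$ (IDB$_1$ defined as in 6.20).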


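PROOF. (i) Let $A$ contain an additional parameter $y^\tau$ besides $x^\sigma$. Assume
$\vdash \forall x^\sigma\,(A \vee \neg A)$, $\vdash \neg\neg\exists x^\sigma A$, and let us assume the choice rule
$$\mathrm{ACR} \qquad \vdash \forall x^\sigma\,\exists y^\tau\,A(x, y) \ \Rightarrow\ \vdash \forall x^\sigma\,A(x, tx)$$

to be established for H (this can be done via mq-realizability, see
TROELSTRA [1973a], 3.7.2(ii)). Then for a suitable $t$ (since
$\forall x^\sigma\,(A \vee \neg A) \leftrightarrow \forall x^\sigma\,\exists z^0\,((z = 0 \wedge A) \vee (z \neq 0 \wedge \neg A))$)

$$\vdash txy = 0 \leftrightarrow Axy$$
and hence $\vdash \neg\neg\exists x^\sigma\,(txy = 0)$. By remark (ii) in 11.2.2,
$$\vdash (\neg\neg\exists x^\sigma\,(txy = 0))^D$$

implies $\vdash \exists x^\sigma\,(txy = 0)$, hence $\vdash \exists x^\sigma A$.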
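(ii) Assume for notational simplicity that $A \equiv \forall x\,\exists y\,C(x, y)$, $C$
quantifier-free, and let

$$\mathrm{HA}^c \vdash A \to \forall u\,\exists v\,B(u, v).$$
With the negative translation

$$\mathrm{HA} \vdash \forall x\,\neg\neg\exists y\,C(x, y) \to \forall u\,\neg\neg\exists v\,B(u, v),$$

hence
$$\mathrm{HA} \vdash \forall x\,\exists y\,C(x, y) \to \forall u\,\neg\neg\exists v\,B(u, v),$$

and so
$$\mathrm{EL} + \mathrm{AC}_{00} \vdash \forall x\,C(x, \alpha x) \to \forall u\,\neg\neg\exists v\,B(u, v)$$
and therefore by the Dialectica interpretation
$$\text{WE-HA}^\omega \vdash \forall x\,C(x, \alpha x) \to \forall u\,\exists v\,B(u, v).$$
Using ECF as a model for WE-HA$^\omega$:
$$\mathrm{EL} \vdash \forall x\,C(x, \alpha x) \to \forall u\,\exists v\,B(u, v),$$
hence
$$\mathrm{EL} + \mathrm{AC}_{00} \vdash \forall x\,\exists y\,C(x, y) \to \forall u\,\exists v\,B(u, v).$$

By the result of GOODMAN [1968] or MINC [1975] (see 3.4) it follows that
$$\mathrm{HA} \vdash \forall x\,\exists y\,C(x, y) \to \forall u\,\exists v\,B(u, v).$$

(iii) Very similar: now use the Dialectica interpretation in WE-HA$^\omega$ +
BR$_0$ + BR$_1$ and take as a model ECF($\mathcal{U}$), $\mathcal{U}$ satisfying EL + $\mathrm{BI_D}$; then use
11.8, 10.5.

(iv) From (iii) with the lemma in 6.22. □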


11.10. Further extensions, references 


The most important extension of the Dialectica interpretation as given
here is the extension to second-order arithmetic and finite type theory with
set variables in GIRARD [1971, 1973]; a brief account is in TROELSTRA
[1973a], 1.9.27, 3.5.21.

For a variant not requiring decidability of prime formulae in the proof of
the Soundness Theorem see DILLER and NAHM [1974].


12. Local and global constructivizations of classical theorems 


12.1. “Local” and “global’’ 


From a logician’s viewpoint it is certainly natural to ask whether there are
general procedures which transform classical results with classical proofs 
into constructive results with constructive proofs. For example, such a 
transformation procedure should be applicable to all statements provable 
in a given classical system H; after transformation of a theorem A with its 
proof in H, the result should be a theorem A * with a constructive proof in 
a constructively justified system H*. 

Such a method of constructivization is here called¹ “global”, in contrast
to ‘“‘local’’ constructivizations, i.e. ad hoc constructive versions of certain 
classical theorems which are obtained by exploiting the specific data of 
each theorem studied. 

It stands to reason that local constructivizations, obtained utilizing the 
specific data of a given problem, may be expected to yield better results, in 
general, than global methods applied to the same problem. 

There are two reasons to be interested in global constructivizations: 


¹ “Global” is sometimes described as “trivial” (because obtainable by a standard method).
Of course, it is not always trivial to recognize something as “trivial”!

(1) by comparison with results obtained by global methods, one obtains
a measure for the improvements obtained by local constructivizations; 

(2) even where the global constructivizations are not optimal, they can 
be very useful as auxiliary results (in obtaining local constructivizations of 
other theorems). 


12.2. The Dialectica interpretation provides us with an example of a global 
constructivization method. Most of our examples given below depend on 
the following. The bulk of theorems in classical analysis uses only weak 
forms of comprehension. Assume therefore A to be a theorem such that 


EL‘ + ess. Aj-CAEF A. 


If $A$ is of the form $C \to \forall u\, \exists v\, B$, $C$ prenex, $B$ quantifier-free, we may
appeal to 11.9(iii) and obtain


EL + BI,+ ACo F A, 
and if C is purely universal, with 11.9(iv), 
IDB, A. 


Many theorems are in fact of the form $C \to \forall u\, \exists v\, B$, $C$ purely universal,
B quantifier-free, and all we have to do in applying the method is to verify, 
for a given statement, that it has the required syntactic form and that the 
proof does not require more than ess. Ai-CA. For theorems which are not 
of the required form, it is often quite easy to find a slightly weaker or a 
classically equivalent statement which does have the required form; in fact, 
the Dialectica translation can be used to discover a suitable reformulation, 
see e.g. 12.4. 

Up till now, the Dialectica interpretation and the closely related 
no-counterexample interpretation (see Chapter D.2) are the only success- 
ful methods for global constructivization. It is easy to see why the negative 
translation is unsatisfactory as a method of constructivization: although it 
establishes that classical theorems formulated in the negative fragment 
may be regarded as intuitionistic theorems, the method is uninteresting
inasmuch as the translation does not yield statements with constructive $\exists$ or
$\lor$, i.e. the operations which are interesting from the viewpoint of (naive)
constructivism are lacking (but cf. GEL'FOND [1972]).

Neither is realizability or modified realizability of any use, since these
interpretations do not realize all instances of $A \lor \neg A$; and first applying
the negative translation, then realizability, does not help either, since
realizability leaves negative formulae essentially unchanged. In contrast,
the Dialectica interpretation applied to negative statements introduces
(constructive) $\exists$.

For another possibility of global constructivization, largely unexplored 
as yet, see TROELSTRA [1977a], 6.12. 


12.3. Bolzano-Weierstrasz or the bisection argument


We shall assume (classical) analysis of reals and real-valued functions to 
be formalized in finite-type theory — so sets of natural numbers are 
represented by their characteristic functions of type 1, sets of reals by type 
2 functionals etc. 

The Bolzano-Weierstrasz theorem, and similarly the assertion that a 
bounded sequence of reals has a least upper bound is usually proved by 
means of a bisection argument which, on inspection, turns out to be 
formalizable with the help of arithmetic comprehension. For the existence 
of the l.u.b. we only have to prove the theorem for sequences of rationals,
as one easily sees; therefore we restrict attention to this case. Let $(r_{\alpha n})_n$
enumerate a sequence of rationals contained in $[0, 1)$. Then

$$\forall k\, \exists m < 2^k\, [\exists n\, (r_{\alpha n} \in [m \cdot 2^{-k}, (m+1) \cdot 2^{-k}]) \land \neg\exists n\, (r_{\alpha n} \geq (m+1) \cdot 2^{-k})].$$

Arithmetic comprehension yields a $\beta$ such that

$$\forall k\, (\exists n\, (r_{\alpha n} \in [\beta k \cdot 2^{-k}, (\beta k + 1) \cdot 2^{-k}]) \land \neg\exists n\, (r_{\alpha n} \geq (\beta k + 1) \cdot 2^{-k})),$$

and $(\beta k \cdot 2^{-k})_k$ is an r.n.g. representing the required l.u.b.
Similarly for the Bolzano-Weierstrasz theorem.


Remark. In an unpublished manuscript, TAKEUTI [1972] shows that classi- 
cal analysis can in fact be developed in a conservative extension of Peano 
arithmetic, obtained by introducing sets of all finite types, with arithmetical 
comprehension without parameters. 


12.4. The intermediate-value theorem and Brouwer's fixed point theorem


We consider the following special case of the intermediate-value
theorem:

(*) Every uniformly continuous real-valued function $f$ on $[0, 1]$
    with $f(0) = -1$, $f(1) = +1$ has an argument $x \in [0, 1]$ such
    that $f(x) = 0$.

A well-known classical proof proceeds as follows. Consider the set
$X = \{r_n : \forall r \leq r_n\, (f(r) < 0)\}$ ($r$ a variable ranging over rationals). $X$ is a
bounded non-empty subset of the rationals in $[0, 1]$, therefore has a l.u.b.
$x < 1$; because of continuity arguments, $f(x) \leq 0$; but $f(x) < 0$ is excluded
(again because of the continuity of $f$), hence $f(x) = 0$.

Now consider the following weakening of (*): 


(**) Every uniformly continuous real-valued $f$ on $[0, 1]$ with
     $f(0) = -1$, $f(1) = +1$ has, for every $k$, an argument $x \in [0, 1]$
     such that $|f(x)| < 2^{-k}$.


(An application of Bolzano-Weierstrasz to (**) would yield (*).)
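
The computational content of (**) can be illustrated by a small search procedure. The following is only a sketch under stated assumptions, not the method of the text: it assumes that f takes exact rational values on rationals and that a modulus of uniform continuity is supplied as a function; the names `approximate_zero` and `modulus` are hypothetical.

```python
from fractions import Fraction


def approximate_zero(f, modulus, k):
    """Return a rational x in [0, 1] with |f(x)| < 2**-k.

    Assumptions (not from the text): f(0) = -1, f(1) = +1, f is exact on
    rationals, and |x - y| <= 2**-modulus(j) implies |f(x) - f(y)| < 2**-j.
    """
    steps = 2 ** modulus(k + 1)        # grid spacing is 2**-modulus(k + 1)
    bound = Fraction(1, 2 ** k)
    for i in range(steps + 1):
        x = Fraction(i, steps)
        # At the first grid point where f is non-negative, the previous
        # value was negative and less than 2**-(k+1) away, so the test
        # below succeeds there at the latest.
        if abs(f(x)) < bound:
            return x
    raise ValueError("the hypotheses on f or on the modulus are violated")
```

For example, `approximate_zero(lambda x: 2 * x - 1, lambda j: j + 2, 5)` returns a rational x with |2x - 1| < 2⁻⁵. The search never decides whether f(x) = 0 holds exactly anywhere, which is why only the weakened conclusion is obtained.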

This can be formulated in finite-type language as follows. Let ® be such 
that for any @ (riayn)n iS the corresponding r.n.g. in the standard 
representation of [0,1] (cf. 5.13); note that 


| Xa = Teal =27"), 
Let f be a uniformly continuous, real-valued function; f can be coded by a, 


B such that (writing a, for (@).) (riea,yndn Tepresents f(r,), B acts as a 
modulus of uniform continuity: 


Vamk OS 7, 21A0S%mS1A| IA — tm | <2 > 
(1) 
> | F(oan +2) — M(@am xk +2)| < 2). 
The conditions f(0)= —1, f(1)= +1 are expressed by (assuming 0 = 1, 
1=r, for simplicity) 


Vk ([re@aone + 1] << 2°-*"), VK (| reese — 1) < 2-**'), (2) 
The conclusion of our statement can be formulated as 
Wk An (| roan! < 2°“). (3) 
Therefore 
(1) $\land$ (2) $\to$ (3)


can be rewritten in the form Vx A ~Wu AvB(u,v), A and B quantifier- 
free, and this statement has been established in EL‘ + ess. Ai-CA, and thus 
by 11.9(iv), 

$\mathrm{IDB}_1 \vdash$ (1) $\land$ (2) $\to$ (3).


Now this result is fairly trivial and can also be easily obtained directly (in 
fact, EL + AC, suffices), and is practically equivalent to (a weakened form 
of) Brouwer's fixed point theorem in one dimension (cf. Remark 5.14). It is
more interesting to note that exactly the same treatment applies to the
following weakening of Brouwer's fixed point theorem in more dimensions:


If $f$ is a uniformly continuous mapping from $I^n$ into $I^n$
($I = [0, 1]$), then $\forall k\, \exists x \in I^n\, (\rho(x, fx) < 2^{-k})$, where $\rho$ is a
Euclidean metric.


For a classical proof of Brouwer’s fixed point theorem, see e.g. ENGELKING 
[1968], pp. 296-304. 

Let us now reconsider the constructive version of the intermediate-value 
theorem and show (a) how the constructive formulation can be found via 
the Dialectica translation and (b) how we can obtain some improvement 
using the Dialectica interpretation directly, not via the conservative 
extension result of 11.9. 

The intermediate-value theorem in its classical form (*) can be expressed 
in the language of finite types as 


Vy°Cy>7Vy AVX’ A(y,x) (4) 


where Wy°Cy expresses (1)A(2) (C quantifier-free), and A(y,x)= 
|<2-"*?. The Dialectica translation of (4) takes the form 


dy Y VX (Cy > A(YX, X(YX))) 


| V (Paay\pny)n 


which implies 
Vy’ Cy > FY WXA(YX, X(YX)). (5) 
Specializing X to Ay°®.x°*: 
Vy’ Cy ~AYVx° A(Y(Ay.x), x), 
and weakening 
Vy°Cy oVx°AyA(y,x) 


which is equivalent to (3). 
On the other hand, weakening (5) to 


Vy’ Cy WX AyA(y, Xv) 


and interpreting X as an (extensional) continuous functional of type 2 
(acting on number-theoretic functions corresponding to r.n.g.’s), we obtain 
in EL+ BI, + ACo:: 
For a uniformly continuous $f$ on $[0, 1]$, $f0 = -1$, $f1 = 1$, and
for y coding an r.n.g., X a continuous type 2 functional, 
Vy? Cy WX? By (fry |< (X7)") 


(KREISEL [1973]). A similar improvement is obtainable for Brouwer's fixed
point theorem. 


12.5. Weierstrasz’ approximation theorem 


This can be stated as follows. 


Every uniformly continuous function on $[0, 1]$ can be
approximated within $2^{-k}$ by a polynomial with rational
coefficients, for all $k$.                                        (1)

Let us assume polynomials with rational coefficients to be coded onto the
natural numbers; let us write $P_k$ for the polynomial with code $k$. Let $\varphi_k$ be
the modulus of uniform continuity (on $[0, 1]$) for $P_k$; we write $\varphi(k, n)$ for
$\varphi_k n$. Then

$$\forall x \in I\, \forall y \in I\, (|x - y| < 2^{-\varphi(k, n)} \to |P_k x - P_k y| < 2^{-n}).$$


Assume f to be a uniformly continuous function coded by a, B; we put yx, 
for the (a2), with r, =k.2-" (Qk =2") and write 


g(k, n,m ) = FO de. nm ¢(B, k, n) = max{Bn, ¢ (k, n)}. 
Note that | f(k.2°")— g(k,n, m)|<2°"*'. It is now easy to verify that 


Vk < 2 eRe Te (ks ¢(B, L nt 2), nt 3)- P, (k 7 2 eGint2))| < 2) (2) 
implies 
Wx El (|fx -— Px| <2") (3) 


and that conversely (3) for n + 3 implies (2). Therefore, if we let Vy° Cy 
stand for (1) of 12.3, we can formulate the approximation theorem as 


Vy°Cy > Vn 312), 


which is of the required form. 
The reader may compare this way of obtaining the Weierstrasz approximation
theorem with the direct route in BISHOP [1967], p. 100.


12.6. Rolle's theorem


Another rather simple example is provided by the following weakening
of Rolle's theorem:

(*) If $f$ is defined on $[0, 1]$ and has a uniformly continuous
    derivative $f'$ on $[0, 1]$, and $f0 = f1 = 0$, then for every $k$
    there is an $x \in [0, 1]$ such that $|f'x| < 2^{-k}$.


Note that (classically as well as constructively) a function with uniformly
continuous derivative is itself uniformly continuous. Thus $f$, $f'$ are coded by
pairs $\alpha$, $\beta$ and $\alpha'$, $\beta'$ respectively; $f0 = 0$, $f1 = 0$ is also expressed by a
purely universal condition; a function $\gamma$ regulates the convergence of
quotients of differences to the derivative, i.e. we define "$f'$ is a derivative of
$f$" by the existence of a $\gamma$ such that

$$x \neq y \land |x - y| < 2^{-\gamma k} \to \left| \frac{fx - fy}{x - y} - f'x \right| < 2^{-k}.$$

We can make sure that $(\alpha, \beta)$, $(\alpha', \beta')$ are in the relation of function and
derivative by requiring


Vamkl (OS 7, S1A0 Sr, SLA A Im N| ta — tm | <2 ™* Alte — tm [> 27! 


V (dank +1) — P(@anyk +1) 
Tn — Vn 


~k+3 
= Keeage | SZ"). 


Now we can again apply our conservative extension result, which yields the 
existence of a constructive proof of Rolle’s theorem as stated in (*). 


12.7. Still further examples can be found e.g. in KREISEL [1952], where
certain classical theorems concerning power series (in real or complex 
variables) are reformulated. 


12.8. Example of a "local" constructivization


Although the weakened version of the intermediate-value theorem 
described above is sometimes useful, it is mathematically more interesting 
to find additional conditions to be added to the premiss of the 
intermediate-value theorem which permit us to maintain the original 
strong conclusion. One such rather obvious additional condition is the
following (BISHOP [1967], p. 59, Exercise 14):

$$\forall k\, l\, (0 \leq r_k \leq 1 \land 0 \leq r_l \leq 1 \land r_k < r_l \to \exists i\, j\, (r_k < r_i < r_l \land |f r_i| > 2^{-j})) \qquad (1)$$

or, what amounts to the same,

$$\forall x\, y\, [x < y \land x \in [0, 1] \land y \in [0, 1] \to \exists z \in [x, y]\, (|fz| > 0)].$$


The additional condition (1) enables us to apply a standard bi-section 
argument: 

Put $[x_0, y_0] = [0, 1]$. Let $r_0 \in [\tfrac14, \tfrac34]$ be such that $f r_0 \neq 0$; if $f r_0 < 0$, take
$[x_1, y_1] = [r_0, 1]$, $[0, r_0]$ otherwise. Assume $[x_n, y_n]$ to be constructed; let
$r_n \in [\tfrac34 x_n + \tfrac14 y_n, \tfrac14 x_n + \tfrac34 y_n]$ be such that $f r_n \neq 0$; if $f r_n < 0$, put
$[x_{n+1}, y_{n+1}] = [r_n, y_n]$, $[x_n, r_n]$ otherwise. It is easy to see that $([x_n, y_n])_n$ converges to a
single point; $f x_n < 0$, $f y_n > 0$ for all $n$, hence for the limit $x$, $fx = 0$.
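
The bisection argument can be sketched as follows. This is an illustration under assumptions of our own, not the construction of the text: f is taken to be exact on rationals (so that "f r ≠ 0" is decidable), the point r is looked for among finitely many equally spaced candidates in the middle half of the current interval, and the names `locate_zero` and `candidates` are hypothetical.

```python
from fractions import Fraction


def locate_zero(f, n_steps, candidates=8):
    """Shrink [lo, hi] around a zero of f, keeping f(lo) < 0 < f(hi)."""
    lo, hi = Fraction(0), Fraction(1)
    if not (f(lo) < 0 < f(hi)):
        raise ValueError("need f(0) < 0 < f(1)")
    for _ in range(n_steps):
        a = (3 * lo + hi) / 4          # left end of the middle half
        b = (lo + 3 * hi) / 4          # right end of the middle half
        r = None
        for i in range(candidates + 1):
            point = a + (b - a) * Fraction(i, candidates)
            if f(point) != 0:          # stands in for the extra condition
                r = point
                break
        if r is None:
            raise ValueError("no nonzero value found among the candidates")
        if f(r) < 0:
            lo = r                     # new interval [r, hi]
        else:
            hi = r                     # new interval [lo, r]
    return lo, hi
```

Each step keeps at most three quarters of the previous interval, so after n steps the interval has length at most (3/4)ⁿ while the sign condition at its end points is preserved.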

Of course, it remains to be shown that (1) is a property possessed by most 
of the ‘‘usual’’ functions. For example, one can show the following. 

If $f$ is $n+1$ times uniformly differentiable, with derivatives
$f^{(1)}, f^{(2)}, \ldots, f^{(n+1)}$, and $C > 0$ is a constant such that on $[a, b]$ ($a < b$)

$$|fx| + |f^{(1)}x| + \cdots + |f^{(n)}x| > C$$

is satisfied, then (1) holds. (1) is also true for strictly increasing or
decreasing functions.
1. Introduction


The purpose of this chapter is to give an expository presentation of the
correspondence between topoi and theories which makes precise
Lawvere's claim that "the notion of topos summarises in objective
categorical form the essence of 'higher-order logic'" (LAWVERE [1975]).
Elementary topoi turn out to correspond to theories in a quite natural logic,
formally an intuitionistic type theory (with the full comprehension axiom).
The theories in question are definitionally complete theories (this notion is
defined in Section 6) and every theory has an obvious definitional
completion. To assert that topoi correspond to theories is not to deny that
certain topoi may be viewed as models for our logic. Models may be
described syntactically by "diagrams", so theories may be said to include
models. Our point is that, in general, topoi may be viewed as theories. In
particular, some topoi which arise semantically are better understood as
theories than as models. Thus we regard topos theory as the "algebraic"
form of this higher-order intuitionistic logic.

The formalization of our logic was developed in joint seminars with
Dana Scott in 1973. The connection with topoi was first described by the
author at an informal session of the 1974 Séminaire de Mathématiques
Supérieures in Montréal and formed the basis of his thesis (FOURMAN
[1974]). Formal systems corresponding to topoi have been described
independently by other authors, notably COSTE [1974] and BOILEAU [1975].
Their systems lack the main innovation of our formalization, an existence
predicate, and they are therefore forced to make awkward restrictions on
the rule of modus ponens. As LAWVERE [1975] has observed, these
restrictions are occasioned by a shortcoming in the traditional interpretation
of variables. We emphatically do not agree that because of this "the
traditional logical way of dealing with variables ... should be abandoned".
The traditional use of variables has much to commend it and there are far
less drastic remedies to hand.

The primary purpose of introducing an existence predicate is to formal- 
ize in a natural way the logic of partial elements (in the case of sheaves 
these are the possibly non-global sections). By modifying the interpretation 
of free variables (allowing them to range over partial elements) we may 
continue to deal with variables in the accustomed way, retaining the 
fundamental transitivity of entailment. The various formal systems corre- 
sponding to topoi are of course formally intertranslatable (via topoi if you 
like). It is thus largely a matter of taste which is preferred. 

A further aim of this chapter is to give a presentation of topoi accessible
to logicians. It was Lawvere's realization that certain logical constructs (e.g.
the formation of power-types) could be interpreted in categories of sheaves
(Grothendieck topoi) which led to the axiomatization of elementary
topoi (LAWVERE [1970], LAWVERE and TIERNEY [1970]). We believe that a
thorough understanding of Lawvere's insight will enable logicians to
exploit the many models provided by Grothendieck and his co-workers
(see e.g. ARTIN, GROTHENDIECK and VERDIER [1972]). These should be
useful not only in the study of pure logic but also in finding applications of
logic extending the non-set-theoretical uses of Boolean-valued models (see
SCOTT [1969], TAKEUTI [1977] and ROUSSEAU [1977]).

I am grateful to Dana Scott for many stimulating conversations and for 
his detailed and constructive criticisms of this work in all its stages. 


2. Category-theoretic preliminaries 


The concepts we require from category theory are quite elementary. We 
review them here since even in introductory texts they are often presented 
with a certain degree of sophistication. If we avoid using general notions 
such as limit and adjoint, this is not because we consider them unimpor- 
tant. For our present purposes, however, it is necessary to spell out the 
elementary properties (in terms of objects and morphisms) which are all 
too often hidden by the abstract definitions. For a more general discussion 
which exhibits the underlying unity of many seemingly diverse constructs 
see MAC LANE [1971] (especially Chapters III and IV).

We assume that the reader knows what categories and functors
(homomorphisms of categories) are. We shall confuse notationally an
object $A$ with its identity morphism $A : A \to A$. Given morphisms
$f : A \to B$ and $g : B \to C$ we write the composition in the order
$g \circ f : A \to C$.


2.1. DEFINITION. A category with finite products has firstly a terminal object
1 (the product of the empty family) with the property that there is a unique
morphism $A \to 1$ from each object $A$ to it, and secondly for each pair $A$, $B$
of objects, a product $A \times B$ equipped with projections $\pi_1$, $\pi_2$ such that for
each pair of morphisms $f : C \to A$ and $g : C \to B$ there is a unique
morphism $\langle f, g \rangle : C \to A \times B$ making the diagram commute.

Diagram 1.


We write $h \times k$ for $\langle h \circ \pi_1, k \circ \pi_2 \rangle : A \times B \to C \times D$ where $h : A \to C$ and
$k : B \to D$.


Remark. In a sense the concept of a category with finite products is
equational (or essentially algebraic). The commutativity of the above
diagram amounts to having the two equations:

$$\pi_1 \circ \langle f, g \rangle = f, \qquad \pi_2 \circ \langle f, g \rangle = g.$$

The uniqueness of $\langle f, g \rangle$ amounts to having the further equation

$$\langle \pi_1 \circ h, \pi_2 \circ h \rangle = h.$$

To make this precise requires a consideration of algebras with partial
operations (a category with finite products is just such an algebra, see SCOTT
[1975]). Similar remarks apply to the following concepts: category, category
with finite limits, cartesian-closed category and topos.
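
The three equations of the Remark can be checked pointwise in the category of sets. The sketch below is only an illustration: Python functions stand in for morphisms, tuples for elements of a product, and the helper names (`pair`, `times`, `equal_on`) are our own.

```python
def compose(g, f):
    """Composition in the order g o f."""
    return lambda x: g(f(x))


def pi1(p):
    return p[0]


def pi2(p):
    return p[1]


def pair(f, g):
    """The morphism <f, g> : C -> A x B induced by f : C -> A, g : C -> B."""
    return lambda c: (f(c), g(c))


def times(h, k):
    """h x k, defined as <h o pi1, k o pi2>."""
    return pair(compose(h, pi1), compose(k, pi2))


def equal_on(domain, u, v):
    """Extensional equality of two functions on a finite domain."""
    return all(u(x) == v(x) for x in domain)
```

With `C = range(6)`, `f = lambda c: c + 1` and `g = lambda c: c * c`, the call `equal_on(C, compose(pi1, pair(f, g)), f)` tests the first equation; the uniqueness equation is tested by comparing `pair(compose(pi1, h), compose(pi2, h))` with any `h` from C into pairs.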


2.2. DEFINITION. A category is cartesian closed if it has finite products and
for each pair $A$, $B$ of objects, we have an exponent $B^A$ and evaluation
morphism

$$\mathrm{ev} : A \times B^A \to B$$

such that for any $f : A \times X \to B$ there is a unique $\hat{f} : X \to B^A$ with

$$\mathrm{ev} \circ (A \times \hat{f}) = f$$

(the uniqueness may be imposed equationally by $(\mathrm{ev} \circ (A \times g))^{\wedge} = g$). We
call $\hat{f}$ the transpose of $f$ and vice versa.
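
In the category of sets the transpose is what programmers call currying. The following sketch, with hypothetical names `transpose` and `untranspose`, checks the defining equation and the uniqueness equation on small finite sets; it illustrates the definition and is not part of the formal development.

```python
def ev(p):
    """Evaluation A x B^A -> B: apply the function component to the element."""
    a, g = p
    return g(a)


def transpose(f):
    """Send f : A x X -> B to its transpose X -> B^A."""
    return lambda x: (lambda a: f((a, x)))


def untranspose(g):
    """Send g : X -> B^A to ev o (A x g) : A x X -> B."""
    return lambda p: ev((p[0], g(p[1])))
```

Here `untranspose(transpose(f))` agrees with `f` on every pair, and `transpose(untranspose(g))(x)` agrees with `g(x)` on every argument, which is the pointwise form of the two equations of the definition.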


2.3. DEFINITION. In an arbitrary category, given morphisms $f, g : A \to B$
we say that $e : X \to A$ is an equalizer of $f$ and $g$ iff not only $f \circ e = g \circ e$, but
also for any $h : Y \to A$ with $f \circ h = g \circ h$ there is a unique $k : Y \to X$ with
$e \circ k = h$. In this case

$$X \xrightarrow{\;e\;} A \underset{g}{\overset{f}{\rightrightarrows}} B$$

is said to be an equalizer diagram.
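
For finite sets the equalizer is simply the subset on which the two functions agree. A minimal sketch, assuming finite lists as objects and with hypothetical helper names `equalizer` and `factor_through`:

```python
def equalizer(A, f, g):
    """Return (X, e): the subset of A where f and g agree, with its inclusion."""
    X = [a for a in A if f(a) == g(a)]
    e = lambda x: x                    # inclusion X -> A
    return X, e


def factor_through(X, e, Y, h):
    """Return the unique k (as a dict) with e(k[y]) == h(y) for all y in Y."""
    k = {}
    for y in Y:
        matches = [x for x in X if e(x) == h(y)]
        if len(matches) != 1:
            raise ValueError("h does not equalize f and g")
        k[y] = matches[0]
    return k
```

The error case corresponds to an h with f∘h ≠ g∘h, for which no factorization through e exists.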


2.4. DEFINITION. The square (f, f’,g,g’) in the diagram is said to be a 


Diagram 2. 


pullback iff not only $f \circ f' = g \circ g'$ but also for any pair $k$, $h$ of morphisms
such that $f \circ k = g \circ h$ there is a unique morphism $e$ such that $f' \circ e = k$ and
$g' \circ e = h$.


The reader should make precise and prove the assertion that terminal 
objects, products, exponents, equalizers and pullbacks are unique up to a 
unique isomorphism. 


2.5. DEFINITION. A morphism $f$ is a monomorphism iff whenever $f \circ g =
f \circ h$, then $g = h$. Monomorphisms (briefly monos) are said to be monic. If $f$
and $k$ are monos with common codomain $A$ we say that $f \leq k$ iff $f$ factors
through $k$; that is, $f = k \circ g$ for some morphism $g$. If $f \leq k$ and $k \leq f$, we
say that $f$ and $k$ represent the same subobject of $A$. We write $P(A)$ for the
(partially ordered) class of subobjects of $A$.


The proof of the basic facts in the following proposition is left to the 
reader. 


2.6. PROPOSITION. (1) Every equalizer is monic.
(2) If $f$ is monic, then $\langle f, g \rangle$ is monic.
(3) If


Diagram 3.


is a pullback and $g$ is monic so is $f'$.
(4) If both small squares are pullbacks, so is the outer rectangle.


Diagram 4.


(5) If a category has finite products and equalizers (i.e. every pair of
morphisms $f, g : A \to B$ has an equalizer), then it has pullbacks (i.e. from
every pair of morphisms with common codomain we can construct a
pullback).
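
Part (5) can be made concrete in finite sets: the pullback of f : A → C and g : B → C is the equalizer of f∘π₁ and g∘π₂ on A × B. The sketch below follows the naming of Definition 2.4 (projections f′ and g′); the function names `pullback` and `mediating` are our own, and the code is an illustration only.

```python
from itertools import product


def pullback(A, B, f, g):
    """Pullback of f : A -> C and g : B -> C, built from product and equalizer."""
    AxB = list(product(A, B))
    # Equalizer of f o pi1 and g o pi2 on the product A x B.
    P = [p for p in AxB if f(p[0]) == g(p[1])]
    f_prime = lambda p: p[0]           # P -> A
    g_prime = lambda p: p[1]           # P -> B
    return P, f_prime, g_prime


def mediating(P, k, h):
    """Given k : Y -> A, h : Y -> B with f o k = g o h, return e : Y -> P."""
    def e(y):
        p = (k(y), h(y))
        if p not in P:
            raise ValueError("k and h do not form a commuting square")
        return p
    return e
```

By construction f′∘e = k and g′∘e = h, and e is the only function into P with this property because an element of P is determined by its two components.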


2.7. DEFINITION. A category is finitely complete iff it has finite products and 
equalizers. (The point here is that such a category has limits for all finite 
diagrams, see MAC LANE [1971].)


3. The logic of partial elements 


Standard formalizations of logic are valid only for non-empty domains. 
Classically we imagine we can decide whether a domain is empty or not. 
Even intuitionistically, so long as we treat only one domain we lose nothing 
by assuming it ‘‘inhabited’’. If, however, we want to consider the subdo- 
main determined by an undecidable predicate, we must say that it may be 
undecidable whether a given x exists qua element of the subdomain. Thus 
in general as soon as we deal with more than one sort we find we must 
adapt our logic to deal with domains of which we can neither decide 
whether they are inhabited, nor whether their elements exist fully. When
we talk about something, we cannot always presuppose its existence. 
Well-formed terms of a language may fail to denote. Many kinds of "free"
logic have been suggested for classical systems but the problem of existence 
only becomes acute in an intuitionistic framework where we may no longer 
assert that a domain must be either inhabited or empty. 

To deal with terms which may not denote we introduce a formal
existence predicate E. We read $E\tau$ as "$\tau$ exists". One may think of free
variables of sort $A$ as ranging over an (implicit) outer domain of potential
elements. (We shall see later how any domain $A$ of partial elements may
be represented as a subdomain of a domain $\tilde{A}$ of total elements which we
may think of as objectifying the potential elements of $A$.) The predicate E
then picks out the subdomain $A$ of actual elements. We modify
Quine's dictum to "to be is to be the value of a bound variable" by
regarding bound variables as restricted to E. Bound variables range only
over actual elements.

Whenever one introduces a domain, one must also introduce a notion of
sameness within that domain. There is a notion of equality of partial
elements which presupposes existence in the following sense:

$$\tau = \sigma \to E\tau \land E\sigma.$$

We also introduce a notion of equivalence, essentially by considering all
elements outside E as equivalent in their non-existence. Intuitionistically,
we must phrase this more positively and express equivalence by

$$\tau \equiv \sigma \leftrightarrow (E\tau \to \tau = \sigma) \land (E\sigma \to \tau = \sigma).$$

This relation is basic to our logic. We banish any possible intensional
notions by requiring that equivalent elements be indistinguishable. Well-defined
predicates should be extensional not only with respect to equality
but also with respect to equivalence. This extensionality is expressed by the
schema

$$\varphi[\tau/x] \land \tau \equiv \sigma \to \varphi[\sigma/x].$$

Since $\equiv$ is so fundamental we take $\equiv$ and E as primitives and define
equality by

$$\tau = \sigma \leftrightarrow \tau \equiv \sigma \land E\tau \land E\sigma.$$
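
The interplay of E, equivalence and equality can be tried out in a deliberately crude model. The sketch below is classical and two-valued, with a single non-existent element represented by `None`; this is an assumption made for illustration only, since the intended models (e.g. sheaves) have graded existence. All function names are hypothetical.

```python
def E(t):
    """Existence predicate: None stands for a term that fails to denote."""
    return t is not None


def equiv(t, s):
    """Equivalence: all non-existent elements are identified with each other."""
    return t == s


def eq(t, s):
    """Equality, defined from equivalence and existence as in the text."""
    return equiv(t, s) and E(t) and E(s)


def implies(p, q):
    return (not p) or q
```

In this model `equiv(None, None)` holds while `eq(None, None)` fails, and for all t, s the value of `equiv(t, s)` coincides with that of (E t → t = s) ∧ (E s → t = s).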


We present the logic as a many-sorted theory with higher types. To give
the higher-order structure it is sufficient to have a map which assigns to
each finite sequence $(A_0, \ldots, A_{n-1})$ of sorts a sort $[A_0, \ldots, A_{n-1}]$ which we
think of as the "power sort" of all $n$-ary relations on the given sequence of
domains. As we shall see other type constructs are reducible to ideas
defined in terms of these power sorts. The axiomatization of higher-order
logic is simplified by the fact that it suffices to take conjunction, implication
and universal quantification as the only primitive logical connectives: the
other connectives are definable in terms of these.

It was suggested by Scott that, having introduced the existence predicate,
it is convenient and straightforward to employ definite descriptions as
terms. We do this as it gives us a conveniently large stock of terms. The
term $Ix\,\varphi$ is to be read "the unique $x$ such that $\varphi$". A fuller discussion of
the intuitionistic logic of partial elements and descriptions may be found in
SCOTT [1977] and FOURMAN and SCOTT [1977]. We now turn to a presentation
of the formalization of this logic.


3.1. DEFINITION. A higher-order language is specified by the following
data:

(1) Two sets Sort and Const (of sorts and constants).

(2) A power-type map from $\bigcup_{n<\omega} \mathrm{Sort}^n$ to Sort written as
$(A_0, \ldots, A_{n-1}) \mapsto [A_0, \ldots, A_{n-1}]$.

(3) A map assigning a sort to each constant, $\# : \mathrm{Const} \to \mathrm{Sort}$.

Given a language, we introduce a set Var of variables. Each variable $x$
has a sort $\#x$. There are countably many variables of each sort.


To avoid confusion we point out that we do not assume that the sorts are
built up syntactically from a set of "ground-sorts" by iterating the
power-type operation. In particular it is not assumed that $[A] = [B]$
implies $A = B$. This "abstractness" poses no real problems.


3.2. DEFINITION. The sets of terms ($\tau, \sigma, \ldots$) and formulae ($\varphi, \psi, \ldots$) of a
language are sets of expressions built up inductively according to the
scheme below. Each term $\tau$ is assigned a sort $\#\tau$ and the inclusion of an
expression is conditional on the proviso's being satisfied.


We leave it to the reader to define the set of free variables of a term or
formula ($\mathrm{FV}(\tau)$ and $\mathrm{FV}(\varphi)$ respectively), and the result of substituting a
term $\sigma$ for a variable $x$ (where $\#\sigma = \#x$) in a term or formula ($\tau[\sigma/x]$
and $\varphi[\sigma/x]$ respectively). For convenience we assume that the substitution
operation changes bound variables as necessary to avoid clashes. We make
free use of customary abbreviations, in particular we write $\forall \vec{x}$ for a finite
string of quantifiers (including the empty string), $\bigwedge$ for a finite conjunction
and $\varphi \leftrightarrow \psi$ for $(\varphi \to \psi) \land (\psi \to \varphi)$.
            Expression                           Sort      Proviso
Terms:      $x$                                  $\#x$     $x \in \mathrm{Var}$
            $c$                                  $\#c$     $c \in \mathrm{Const}$
            $Ix\,\varphi$                        $\#x$     $x \in \mathrm{Var}$, $\varphi$ a formula

            Expression                                     Proviso
Formulae:   $E\tau$                                        $\tau$ a term
            $\tau \equiv \sigma$                           $\#\tau = \#\sigma$
            $\tau(\sigma_0, \ldots, \sigma_{n-1})$         $\#\tau = [\#\sigma_0, \ldots, \#\sigma_{n-1}]$
            $(\varphi \land \psi)$                         $\varphi$, $\psi$ formulae
            $(\varphi \to \psi)$                           $\varphi$, $\psi$ formulae
            $\forall x\, \varphi$                          $x \in \mathrm{Var}$, $\varphi$ a formula


We note here that the empty sequence of sorts gives rise to the special
power sort $[\ ]$ which can be regarded as consisting of truth values. If $\tau$ is a
term of sort $[\ ]$, then $\tau(\ )$ is a formula which in effect asserts the
proposition $\tau$.


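3.3. DEFINITION. As axioms and rules we take those instances of the
following schemata which are well-formed.

Axioms:

(Propositional axioms)
    $\varphi \to (\psi \to \varphi)$,
    $(\varphi \to (\psi \to \theta)) \to ((\varphi \to \psi) \to (\varphi \to \theta))$,
    $(\varphi \land \psi) \to \varphi$,
    $(\varphi \land \psi) \to \psi$,
    $\varphi \to (\psi \to (\varphi \land \psi))$;
($\equiv$)    $\varphi[y/x] \land y \equiv z \to \varphi[z/x]$,
(E)       $\forall x\, (x \equiv y \leftrightarrow x \equiv z) \to y \equiv z$,
($\forall$)    $\forall x\, \varphi \land Ex \to \varphi$,
(I)       $\forall y\, (y \equiv Ix\,\varphi \leftrightarrow \forall x\, (\varphi \leftrightarrow x \equiv y))$,
(Comp)    $E\, Iy\, \forall \vec{x}\, (\varphi \leftrightarrow y(\vec{x}))$,
(Pred)    $y(\vec{x}) \to Ey \land \bigwedge E x_i$.

Rules:

$$\text{(MP)}\ \frac{\varphi \qquad \varphi \to \psi}{\psi}, \qquad \text{(Sub)}\ \frac{\varphi}{\varphi[\tau/x]},$$

$$(\forall^{+})\ \frac{\psi \land Ex \to \varphi}{\psi \to \forall x\, \varphi} \quad \text{where } x \notin \mathrm{FV}(\psi).$$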
The propositional axioms are those standard for intuitionistic logic.
Axiom ($\equiv$) is the principle of substitutivity of equivalents. The
extensionality axiom (E) embodies the principle that for two things ($y$ and
$z$) to be equivalent it suffices that they compare equivalently with actual
objects (those in the range of the bound variable $x$). The axiom ($\forall$) of
universal instantiation permits the passage from a bound variable statement
to the corresponding free variable statement given the premise of
existence. The axiom of descriptions (I) requires the outer quantifier: an
existing element is equivalent to the element described just in case it is the
only (existing) element satisfying the predicate. We need not mention
improper descriptions explicitly; their properties follow from the axiom of
extensionality and the properties of the quantifier. The first-order axioms
stop here. (Of course if we were just doing first-order logic we would add
the other connectives, the existential quantifier and the "obvious" axioms
and rules, namely those of Theorem 3.5.)

To axiomatize the higher-order logic we add the full comprehension 
axiom (Comp): every predicate has an extension, a unique element of the 
appropriate power sort. We note that this axiom implies a property of 
extensionality for power sorts (Lemma 3.6(6)). The last axiom (Pred) of 
predication is more or less a grammatical convention: if we predicate a 
relation of certain arguments then both it and all its arguments should 
exist. 

The rules are straightforward: modus ponens (MP) and substitution (Sub)
are as usual. The rule ($\forall^{+}$) of universal generalization is stronger than the
usual one since the premise is weakened by the existence assumption.

The notion of derivability is defined as usual. We write $\Gamma \vdash \varphi$ to mean
that $\varphi$ is derivable from the set $\Gamma$ of formulae. A theory is a set $T$ of
formulae such that if $T \vdash \varphi$, then $\varphi \in T$. Any theory $T$ is necessarily
formulated in a language $L(T)$. We shall have occasion to use varying
languages from time to time.

In the interests of readability we may write 


$\forall x{:}A.\,\varphi$ for $\forall x\, \varphi$,
$Ix{:}A.\,\varphi$ for $Ix\, \varphi$,
where #x = A, to make the implicit sort visible. We may also write x: A 


to express that #x = A. 


3.4. DEFINITION. Taking advantage of quantification over "truth-values"
we introduce the remainder of the traditional panoply of logical connectives
by the following abbreviations:

(1) $\varphi \lor \psi$ for $\forall z{:}[\ ]\, ((\varphi \to z(\ )) \land (\psi \to z(\ )) \to z(\ ))$,
(2) $\neg\varphi$ for $\forall z{:}[\ ]\, (\varphi \to z(\ ))$,
(3) $\exists x\, \varphi$ for $\forall z{:}[\ ]\, (\forall x\, (\varphi \to z(\ )) \to z(\ ))$,
(4) $\top$ for $E\, Iy{:}[\ ].\, y(\ )$,
(5) $\bot$ for $\forall z{:}[\ ].\, z(\ )$.


It is a simple exercise in finding formal derivations to show that these 
defined notions behave properly. The axiom and rule for the existential 
quantifier have to be modified to take account of existence (as with V). 
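
That the abbreviations of 3.4 behave properly can be checked semantically in a small model. The sketch below is an illustration under assumptions of our own: truth values are taken from the three-element chain 0 < 1 < 2 regarded as a Heyting algebra, quantification over the sort of truth values is interpreted as a minimum over the chain, and every element is taken to exist, so the existence clauses play no role. All names are hypothetical.

```python
CHAIN = (0, 1, 2)      # three truth values; 2 is "true", 0 is "false"
TOP = 2


def imp(a, b):
    """Heyting implication on a chain."""
    return TOP if a <= b else b


def forall(fn):
    """Quantification over truth values z, read as a meet over the chain."""
    return min(fn(z) for z in CHAIN)


def disj(a, b):
    """Defined disjunction: for all z, ((a -> z) and (b -> z)) -> z."""
    return forall(lambda z: imp(min(imp(a, z), imp(b, z)), z))


def neg(a):
    """Defined negation: for all z, a -> z."""
    return forall(lambda z: imp(a, z))


def exists(values):
    """Defined existential quantifier over a non-empty family of values."""
    return forall(lambda z: imp(min(imp(v, z) for v in values), z))


BOTTOM = forall(lambda z: z)   # defined falsity: for all z, z
```

In this model `disj(a, b)` equals `max(a, b)` for all a, b, and `neg(a)` equals `imp(a, BOTTOM)`, while `disj(1, neg(1))` is not `TOP`, so excluded middle is not valid, as one expects of an intuitionistic logic.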


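3.5. THEOREM.

(1) $\vdash \varphi \to \varphi \lor \psi$,
(2) $\vdash \psi \to \varphi \lor \psi$,
(3) $\vdash (\varphi \to \theta) \to ((\psi \to \theta) \to (\varphi \lor \psi \to \theta))$,
(4) $\vdash (\varphi \to \psi) \to ((\varphi \to \neg\psi) \to \neg\varphi)$,
(5) $\vdash \neg\varphi \to (\varphi \to \psi)$,
(6) $\vdash \varphi \land Ex \to \exists x\, \varphi$,
(7) $\vdash \top$,
(8) $\vdash \bot \to \varphi$,
(9) The rule $\dfrac{\varphi \land Ex \to \psi}{\exists x\, \varphi \to \psi}$ is valid where $x$ is not free in $\psi$.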
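

3.6. LEMMA.

(1) $\vdash Ex \leftrightarrow \exists y.\, x = y$,
(2) $\vdash E\, Ix\,\varphi \leftrightarrow \exists y\, \forall x\, (\varphi \leftrightarrow x = y)$,
(3) $\vdash z \equiv Ix\,\varphi \leftrightarrow \forall y\, (y = z \leftrightarrow \forall x\, (x = y \leftrightarrow \varphi))$,
(4) $\vdash z(\ldots, Ix\,\varphi, \ldots) \leftrightarrow \exists y\, (z(\ldots, y, \ldots) \land \forall x\, (\varphi \leftrightarrow x = y))$,
(5) $\vdash Ix\,\varphi(\cdots) \leftrightarrow \exists y\, (y(\cdots) \land \forall x\, (\varphi \leftrightarrow x = y))$,
(6) $\vdash \forall y, z{:}[A_0, \ldots, A_{n-1}]\, (y = z \leftrightarrow \forall \vec{x}\, (y(\vec{x}) \leftrightarrow z(\vec{x})))$,
(7) $\vdash x \equiv y \leftrightarrow \forall z{:}[A]\, (z(x) \leftrightarrow z(y))$,
(8) $\vdash \forall x, y\, (x = y \leftrightarrow x \equiv y)$,
(9) $\vdash E\, Ix\,\varphi \to \varphi[Ix\,\varphi/x]$.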
These logical truths indicate how existence, descriptions, equivalence
and equality are related. As a consequence we have the following
labour-saving lemma remarked by Scott:


3.7. LEMMA. Every formula $\varphi$ of higher-order logic is logically equivalent to
a description-free one.


PROOF. By induction on the structure of $\varphi$ using (2)-(5) of 3.6 to eliminate
descriptions from atomic formulae. □


The sorts given ab initio may be rather special and supply only a 
selection of the domains generally required in mathematics. However their 
higher-order structure makes them rich in subdomains. We define these 
more general (and more useful) types to be certain syntactic objects of our 
languages. 


3.8. DEFINITION. A type A is a term of the syntactic form
$Iy{:}[A].\, \forall x\, (\varphi \leftrightarrow y(x))$ (which we abbreviate as $\{x : A \mid \varphi\}$). We say A is
definable iff it is a closed term.


Each type $A = \{x : A \mid \varphi\}$ is to be regarded as determining a subdomain
of the sort $A$. (We could not define a type to be any term of sort $[A]$ since
for different $A$, $B$ it might occur that $[A] = [B]$, and so the term alone does
not determine the underlying sort. However given any term $\tau$ of sort $[A]$
there is an obvious associated type $\{x : A \mid \tau(x)\}$.) For $A = \{x : A \mid \varphi\}$ we
use the notations $\tau \in A$; $\forall x \in A$; $\exists x \in A$ with their obvious meanings,
where $\tau$ and $x$ are of sort $A$.


3.9. DEFINITION. A (definable) relation $F$ from $A$ to $B$ is a (closed) term of
the form

$$Iz{:}[A, B].\, \forall x{:}A,\, y{:}B\, (z(x, y) \leftrightarrow \varphi).$$

(Again the syntactical form of $F$ carries with it information on $A$ and $B$.)
We do not need to know that a relation $F$ from $A$ to $B$ is (the graph of) a
partial function to employ the ordinary function-value notation

$$F'(\tau) \text{ for } Iy{:}B.\, F(\tau, y),$$

where $\tau$ is a term of sort $A$. If $\sigma$ is a term of sort $B$ and $x : A$ we define
functional abstraction by abbreviation, writing

$$\lambda x{:}A.\,\sigma \text{ for } Iz{:}[A, B].\, \forall x, y\, (z(x, y) \leftrightarrow y = \sigma).$$
3.10. LEMMA. The usual conversion principles hold:

(1) $\vdash \forall x{:}A\, (x \in \{x : A \mid \varphi\} \leftrightarrow \varphi)$,

(2) $\vdash \forall x{:}A\, ((\lambda x{:}A.\,\sigma)'(x) \equiv \sigma)$.


4. Categories from logic 


Our eventual aim is to show how topoi correspond to theories in our 
logic. Here we look at some categories arising from such theories and 
examine their structure in terms of the concepts introduced in Section 2. To 
construct a category from a theory T we take as objects a collection of 
definable types and as morphisms take the definable total functions be- 
tween these types. The idea of constructing such syntactic categories (due 
to A. Joyal) has been exploited by Reyes [1975] and others. 


4.1. DEFINITION. The category C(T) of sorts and definable total functions of
T has as objects the sorts of L(T) and as morphisms from A to B
equivalence classes of definable relations F from A to B such that


T ⊢ ∀x:A.EF′(x),
where F and G are equivalent iff
T ⊢ ∀x:A (F′(x) = G′(x)).
The composition F∘G is given (when it is defined) by the term
λx.F′(G′(x)).


Checking that C(T) is a category is easy. We leave it to the reader. In
general, the following construction gives a more interesting category.


4.2. DEFINITION. The category E(T) of definable types and definable total 
functions of T has the definable types of L(T) as objects and as morphisms 
from A to B equivalence classes of definable relations F from A to B such 
that 


T ⊢ ∀x ∈ A.F′(x) ∈ B,
where F and G are equivalent iff
T ⊢ ∀x ∈ A.F′(x) = G′(x).


The composition of F and G is defined as in the previous definition.
Again it is simple to see that we have a category. The following lemmas
examine its structure. The proofs are just syntactic versions of the proof 
that the classical category Sets has the corresponding structure. We sketch 
the necessary constructions leaving the details to the reader. 


4.3. Lemma. E(T) has finite products. 


Proof. {z:[ ] | z( )} acts as a terminal object since
⊢ ∃!y.y ∈ {z:[ ] | z( )}.
If τ:A and σ:B we write ⟨τ, σ⟩ for
ιz:[A, B].∀x,y (z(x, y) ↔ x = τ ∧ y = σ).

The product A × B is given by

{z:[A, B] | ∃x ∈ A, y ∈ B.z = ⟨x, y⟩}
with the projections exemplified by

λz.ιx.∃y.z = ⟨x, y⟩

from A × B to A. □


In future we shall write λ⟨x, y⟩.τ for


λz.ιw.∃x,y (z = ⟨x, y⟩ ∧ τ = w).


4.4. LEMMA. E(T) has finite limits.


Proof. Since E(T) has finite products (Lemma 4.3), it suffices to construct
equalizers. Let F, G : A → B in E(T). The equalizer of F and G has as its
domain


{x : A | x ∈ A ∧ F′(x) = G′(x)}


and is represented by λx.x. □


4.5. LEMMA. E(T) is Cartesian closed.


Proof. The exponent B^A is given by
{z:[A, B] | ∀x ∈ A.∃!y ∈ B.z(x, y) ∧ ∀x,y (z(x, y) → x ∈ A ∧ y ∈ B)}
with the evaluation morphism given by


λ⟨x, z⟩.ιy.z(x, y).
For F : A × X → B the transposed morphism F̂ : X → B^A is given by
λw.ιz:[A, B].∀x,y (z(x, y) ↔ y = F′(⟨x, w⟩)). □
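
In the classical category of finite sets the exponent construction above can be checked by brute force. The following Python sketch is only an illustration under that assumption: it models B^A as the set of total single-valued relations between two finite sets, and the names exponent, evaluate and transpose, as well as the sample data, are hypothetical and do not come from the text.

```python
from itertools import product

def exponent(A, B):
    """B^A as the total single-valued relations z, each a set of pairs (x, y)."""
    return [frozenset(zip(A, values)) for values in product(B, repeat=len(A))]

def evaluate(x, z):
    """Evaluation: the unique y with (x, y) in z."""
    (y,) = [v for (u, v) in z if u == x]
    return y

def transpose(A, X, F):
    """Send w in X to the relation {(x, F(x, w)) | x in A}."""
    return {w: frozenset((x, F[x, w]) for x in A) for w in X}

# Hypothetical sample data: F is a function A x X -> B given as a dict on pairs.
A = [0, 1]
B = ["a", "b", "c"]
X = ["u", "v"]
F = {(0, "u"): "a", (1, "u"): "b", (0, "v"): "c", (1, "v"): "c"}
F_hat = transpose(A, X, F)
```

Evaluating the transposed map recovers F, which is the finite-set shadow of the universal property of the exponent.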


4.6. LEMMA. A morphism F : A → B in E(T) is a monomorphism iff
T ⊢ ∀x,y ∈ A (F′(x) = F′(y) → x = y).


Proof. Let F be a monomorphism. We have two morphisms
G = λ⟨x, y⟩.x and H = λ⟨x, y⟩.y
(the projections) from
E = {z:[A, A] | ∃x,y ∈ A (z = ⟨x, y⟩ ∧ F′(x) = F′(y))}


to A. It is easy to see that F∘G = F∘H, hence, as F is monic, G = H. That is
to say,


T ⊢ ∀z ∈ E (G′(z) = H′(z))
whence


T ⊢ ∀x,y ∈ A (F′(x) = F′(y) → x = y).
The proof in the other direction is easier still. □


4.7. LEMMA. A commuting square in E(T)


[Diagram 5: square with F : A → B, G : A → C, H : B → D, K : C → D.]


is a pullback iff
T ⊢ ∀x ∈ B, y ∈ C (H′(x) = K′(y) → ∃!z ∈ A (F′(z) = x ∧ G′(z) = y)).
Proof. The proof one way is straightforward: from the formula we can
deduce the pullback property. Now let K and H be as above. Consider the
diagram


[Diagram 6: square with L : E → B, M : E → C, H : B → D, K : C → D.]


where L and M are the “projections” of
E = {z:[B, C] | ∃x ∈ B, y ∈ C (z = ⟨x, y⟩ ∧ H′(x) = K′(y))}.


This is a pullback (using the easy direction of the lemma). The result now
follows from the uniqueness of pullbacks up to isomorphism. □
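
For finite sets, the criterion of Lemma 4.7 and the canonical pullback used in its proof can be computed directly. The sketch below is a hedged illustration in Python, assuming functions are given as dicts; the names pullback and satisfies_criterion and the sample data are hypothetical.

```python
def pullback(B, C, H, K):
    """E = {(x, y) in B x C | H(x) = K(y)} together with its two projections."""
    E = [(x, y) for x in B for y in C if H[x] == K[y]]
    L = {z: z[0] for z in E}
    M = {z: z[1] for z in E}
    return E, L, M

def satisfies_criterion(A, F, G, B, C, H, K):
    """The formula of Lemma 4.7: every matching pair (x, y) has exactly one z over it."""
    return all(
        sum(1 for z in A if F[z] == x and G[z] == y) == 1
        for x in B for y in C if H[x] == K[y]
    )

# Hypothetical sample data.
B = [0, 1, 2]
C = ["a", "b"]
H = {0: "even", 1: "odd", 2: "even"}
K = {"a": "even", "b": "odd"}
E, L, M = pullback(B, C, H, K)

# A commuting square that is not a pullback: the pair (2, "a") is not covered.
A2 = [0, 1]
F2 = {0: 0, 1: 1}
G2 = {0: "a", 1: "b"}
```

The canonical square passes the criterion, while the smaller commuting square fails it because one matching pair has no preimage.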


5. Topoi 


Here we introduce elementary topoi and give some examples. In 
particular we show that for any theory T the category E(T) is a topos. We 
shall see in Section 7 that every topos is of this form. 


5.1. DEFINITION. A topos is a Cartesian-closed category with a subobject
classifier.


To understand this we need: 


5.2. DEFINITION. A subobject classifier is a morphism true : 1 → Ω (this
distinguishes the codomain and an important “element”) such that pullbacks
along true exist and for any mono m : A′ ↣ A there is a unique
morphism cl(m) : A → Ω (the classifier of m) making the diagram a pullback.


[Diagram 7: square with m : A′ ↣ A, the unique morphism A′ → 1, true : 1 → Ω and cl(m) : A → Ω.]


Ω is to be thought of as a type of “truth values”. It is immediate from the
definition that in a topos the set P(A) of subobjects of an object A is in
1-1 correspondence with the set Hom(A, Ω) of morphisms from A to Ω.
To determine a function whose values are truth values, it suffices to
determine the subobject on which it takes the value true. We shall often
need to refer to the morphism true_A : A → Ω which is given by composing
true : 1 → Ω with the (unique) morphism from A to 1. A morphism
f : A → Ω factors through true iff f = true_A. A subobject classifier, if it
exists, is unique up to a unique isomorphism.
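
In the category of finite sets, with Ω taken to be {True, False}, the correspondence between subobjects and morphisms into Ω is the familiar one between subsets and characteristic functions. The following Python sketch illustrates it under that assumption; the function names are hypothetical.

```python
from itertools import product

def subsets(A):
    """All subsets of the finite list A, as frozensets."""
    return [
        frozenset(a for a, keep in zip(A, bits) if keep)
        for bits in product([False, True], repeat=len(A))
    ]

def classifier(A, S):
    """cl(S): the characteristic function A -> {True, False} of the subset S."""
    return {a: (a in S) for a in A}

def pullback_of_true(chi):
    """The subset on which chi takes the value True."""
    return frozenset(a for a, value in chi.items() if value)

A = ["p", "q", "r"]
subs = subsets(A)
maps = [classifier(A, S) for S in subs]
```

Pulling a characteristic function back along true returns the subset it came from, so the two assignments are mutually inverse.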


5.3. THEOREM. The category E(T) is a topos. 


Proof. The subobject classifier is given by taking as Ω the object


{x:[ ] | ⊤},
true being the morphism given by
λx.ιy:[ ].y( ).
For a monomorphism M : A′ ↣ A we take
λx.ιy:[ ](y( ) ↔ ∃z ∈ A′.x = M′(z)).
Using the characterization of Lemma 4.7 it is easily seen that this gives a
pullback and is unique with the property. Since we already know E(T) to be
Cartesian closed with finite limits we are done. □


Of course a similar syntactic construction may be used to produce a 
topos from a classical set theory. 


Example. Let L be the first-order language of set theory, T a theory in L
containing classical Zermelo set theory. The category of definable sets and
functions of T is a topos. It has as objects formulae φ(x) of L (we use
parentheses to indicate a complete non-repetitive enumeration of the free
variables of a formula) such that


T ⊢ ∃y ∀x (φ(x) ↔ x ∈ y).


As morphisms from φ(x) to ψ(y) we take equivalence classes (under
T-provable equivalence) of formulae θ(x, y) such that


T ⊢ ∀x,y (θ(x, y) → φ(x) ∧ ψ(y)),
T ⊢ ∀x (φ(x) → ∃z ∀y (θ(x, y) ↔ y = z)).


We leave the definition of composition to the reader. In this example the
algebra Hom(1, Ω) of definable elements of Ω is just the Lindenbaum
algebra of T.


Before going on to develop some of the theory of topoi we introduce 
some more concrete examples. 


5.4. DEFINITION. Let Ω be a complete Heyting algebra. An Ω-set A is a set
A equipped with a “symmetric, transitive Ω-valued relation”, that is a map


e : A × A → Ω
satisfying
e(a, b) = e(b, a), e(a, b) ∧ e(b, c) ≤ e(a, c)


for a, b, c ∈ A.


An Ω-set A = (A, e) should be thought of as a Heyting-valued set with
partial elements. The degree of equality of two elements is measured by e,
the degree of existence of a being given by e(a, a). Ω-sets can be used to
give a semantics for our logic (see FOURMAN and SCOTT [1977]). In fact it
was with this model in mind that the logic was axiomatized.


5.5. DEFINITION. A morphism of Ω-sets from (A, e) to (B, f) is an
extensional, total, single-valued Ω-relation, that is a map g : A × B → Ω
satisfying

e(a, a′) ∧ f(b, b′) ∧ g(a, b) ≤ g(a′, b′),


e(a, a) = ⋁_{b ∈ B} g(a, b),


g(a, b) ∧ g(a, b′) ≤ f(b, b′)
for a, a′ ∈ A and b, b′ ∈ B.


Example. The category of Ω-sets and their morphisms forms a topos,
where the composition of g : A → B with h : B → C is given by


(h∘g)(a, c) = ⋁_{b ∈ B} (g(a, b) ∧ h(b, c)).
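
For a finite Heyting algebra the conditions of Definitions 5.4 and 5.5 and the composition formula can be verified exhaustively. The Python sketch below assumes, purely for illustration, that Ω is the chain 0 < 1 < 2 (so meet is min and join is max); the function names and the sample Ω-set are hypothetical.

```python
def is_omega_set(A, e):
    """Symmetry and transitivity of the Omega-valued equality e."""
    symmetric = all(e[a, b] == e[b, a] for a in A for b in A)
    transitive = all(
        min(e[a, b], e[b, c]) <= e[a, c] for a in A for b in A for c in A
    )
    return symmetric and transitive

def is_morphism(A, e, B, f, g):
    """Extensional, total and single-valued, as in Definition 5.5."""
    extensional = all(
        min(e[a, a2], f[b, b2], g[a, b]) <= g[a2, b2]
        for a in A for a2 in A for b in B for b2 in B
    )
    total = all(e[a, a] == max(g[a, b] for b in B) for a in A)
    single_valued = all(
        min(g[a, b], g[a, b2]) <= f[b, b2] for a in A for b in B for b2 in B
    )
    return extensional and total and single_valued

def compose(A, B, C, g, h):
    """(h o g)(a, c) = join over b of g(a, b) meet h(b, c)."""
    return {(a, c): max(min(g[a, b], h[b, c]) for b in B) for a in A for c in C}

# Hypothetical sample: v exists only to degree 1 and equals u to degree 1.
A = ["u", "v"]
e = {("u", "u"): 2, ("v", "v"): 1, ("u", "v"): 1, ("v", "u"): 1}
```

In this sample the equality e itself satisfies the morphism conditions and is unchanged by composition with itself, so it behaves as the identity on (A, e).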


In the category of Ω-sets we have Ω = (Ω, ↔) where ↔ is the
bi-implication. In the case where Ω is the Heyting algebra O(X) of open sets
of a topological space X, the category of Ω-sets is (equivalent to) the
category Top(X) of set-valued sheaves over X. Ω-sets have been described
independently by HIGGS [1974] who (among other things) gives a proof of
this equivalence.


More Examples. (i) The category of finite sets and arbitrary functions
between them is a topos.

(ii) Let 𝔄 = (A, ε) be a model for Zermelo set theory. The category of
sets and functions of 𝔄, whose objects are the elements of A and whose
morphisms from a to b are those f in A such that 𝔄 ⊨ (f is a function from
a to b), is a topos.

(iii) Let G be a group. The category of G-sets has as objects sets
equipped with a G-action and as morphisms functions preserving this
action. It is a topos.


In these last three examples Ω is the set {true, false} (with the trivial
action in the case of G-sets). The first two show the lack of axioms
corresponding to the set-theoretical axioms of infinity and replacement in
the definition of topoi. We shall discuss these axioms in Section 8.

Let G be the cyclic group of order 2 and consider the G-sets (a, e) and
(b, f) where a is a singleton and e the trivial action, b is a two element set
and f the non-trivial action. We think of (b, f) as having two elements (in the
sense that the formula ∃x,y.¬x = y is satisfied) which the permutation
makes indistinguishable, thus preventing any of the functions from a to b
being “definable”. There are no morphisms from (a, e) to (b, f). In this way
we apply the syntactic notion of definability to the category of G-sets, a
priori a semantic entity.
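
The claim about the cyclic group of order 2 can be confirmed by enumerating all functions and keeping those that commute with the generator (which suffices, since the identity element acts trivially). This Python sketch is an illustration only; equivariant_maps is a hypothetical helper, and each action is represented by the permutation induced by the generator.

```python
from itertools import product

def equivariant_maps(X, act_X, Y, act_Y):
    """All functions X -> Y that commute with the action of the generator."""
    maps = []
    for values in product(Y, repeat=len(X)):
        h = dict(zip(X, values))
        if all(h[act_X[x]] == act_Y[h[x]] for x in X):
            maps.append(h)
    return maps

a, e = ["*"], {"*": "*"}        # singleton with the trivial action
b, f = [0, 1], {0: 1, 1: 0}     # two points swapped by the generator

from_a_to_b = equivariant_maps(a, e, b, f)
from_b_to_a = equivariant_maps(b, f, a, e)
from_b_to_b = equivariant_maps(b, f, b, f)
```

Neither of the two functions from a to b is equivariant, while b still has two endomorphisms (the identity and the swap).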


6. Logical constructs in topoi 


Here we develop that part of topos theory we shall need. This theory is
due to Lawvere and Tierney [1970]. The development is reminiscent of
that of algebraic logic. The Cartesian-closed structure gives us certain
“finite types” and encapsulates the fact that definable maps are closed
under composition, λ-abstraction and pairing. The correspondence between
the set P(A) of subobjects of A and Hom(A, Ω) allows for a fruitful
interplay between the algebraic properties of the category and the logical
properties of Ω.


6.1. DEFINITION. The equality morphism =_A : A × A → Ω classifies the
diagonal Δ_A = ⟨A, A⟩ : A → A × A.


6.2. LEMMA. Topoi are finitely complete.


Proof. It suffices to show that we have equalizers. Given f, g : A → B, the
subobject of A classified by =_B∘⟨f, g⟩ is easily seen to be their
equalizer. □


6.3. DEFINITION. We give the “truth table” definitions
∧ : Ω × Ω → Ω classifies ⟨true, true⟩ : 1 → Ω × Ω,
↔ : Ω × Ω → Ω classifies ⟨Ω, Ω⟩ : Ω → Ω × Ω


and extend these operations to each poset P(A) of subobjects of an object
A. Confusing a subobject with the corresponding morphism A → Ω, let


a ∧ b := ∧∘⟨a, b⟩,
a ↔ b := ↔∘⟨a, b⟩,


and
a → b := (a ∧ b) ↔ a.


In a topos we can represent the subobject poset P(A) as a family of sets 
in a natural way. This will enable us to get a better hold on the operations 
we have just defined. 


6.4. DEFINITION. To each subobject a of A we associate the set ‖a‖ of
morphisms with codomain A which factor through a. Equivalently,
identifying a with the corresponding morphism a : A → Ω we write


‖a‖ = {x : X → A | a∘x = true_X}.


6.5. LEMMA. (1) a ≤ b iff ‖a‖ ⊆ ‖b‖;
(2) x ∈ ‖a ∧ b‖ iff x ∈ ‖a‖ and x ∈ ‖b‖;
(3) c ≤ a ∧ b iff c ≤ a and c ≤ b;
(4) x ∈ ‖a ↔ b‖ iff for all y, x∘y ∈ ‖a‖ iff x∘y ∈ ‖b‖;
(5) x ∈ ‖a → b‖ iff for all y, if x∘y ∈ ‖a‖, then x∘y ∈ ‖b‖;
(6) c ≤ a → b iff c ∧ a ≤ b.


Proof. These are all quite straightforward. For ↔ note that
x ∈ ‖a ↔ b‖ iff ⟨a, b⟩∘x factors through ⟨Ω, Ω⟩
iff a∘x = b∘x
iff ‖a∘x‖ = ‖b∘x‖
but as y ∈ ‖a∘x‖ iff x∘y ∈ ‖a‖ we are done. □
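
The definition of implication from conjunction and bi-implication in 6.3, and clause (6) of Lemma 6.5, can be tested in a small Heyting algebra. The Python sketch below assumes, as an illustration only, the three-element chain 0 < 1 < 2, where bi-implication is the top element on equal arguments and the smaller argument otherwise; all names are hypothetical.

```python
CHAIN = [0, 1, 2]   # assumed example algebra: a three-element chain
TOP = 2

def meet(x, y):
    return min(x, y)

def iff(x, y):
    """Bi-implication on the chain: top when equal, otherwise the smaller value."""
    return TOP if x == y else min(x, y)

def implies(a, b):
    """Definition 6.3 read in the chain: a -> b is (a meet b) <-> a."""
    return iff(meet(a, b), a)

# Clause (6) of Lemma 6.5, checked exhaustively: c <= (a -> b) iff c meet a <= b.
adjunction_holds = all(
    (c <= implies(a, b)) == (meet(c, a) <= b)
    for a in CHAIN for b in CHAIN for c in CHAIN
)
```

The derived operation agrees with the usual Heyting implication on the chain, which is what the adjunction check expresses.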


In showing that topoi have equalizers we constructed a subobject of A
by using the “internal logic” to describe those elements which should be in
it. We shall use this idea again.


6.6. DEFINITION. Given a ∈ P(A × B) we have a morphism â : B → Ω^A.
We also have t̂ : B → Ω^A, the transpose of t : A × B → 1 → Ω (the second
morphism being true). We define Π_A(a) to be the subobject of B classified by


=_{Ω^A}∘⟨â, t̂⟩.


Here we think of â as taking each x in B to the set of y in A such that
⟨y, x⟩ is in a and t̂ as taking each x in B to A. Π_A(a) is to be thought of as
the set of those x in B such that for all y in A the pair ⟨y, x⟩ is in a. As yet
of course we can only make such thoughts precise when talking of topoi
which we know to be of the form E(T). They are however invaluable to the
intuition.


6.7. LEMMA. x ∈ ‖Π_A(a)‖ iff A × x ∈ ‖a‖.


Proof. x ∈ ‖Π_A(a)‖ iff â∘x = t̂∘x iff a∘(A × x) = t∘(A × x) iff A × x
∈ ‖a‖. □
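
Read in the category of finite sets, the construction of Π_A(a) compares, for each x in B, the fibre of a over x with the whole of A. The Python sketch below follows that reading as an illustration; a_hat and forall_along are hypothetical names, and a is given as a set of pairs (y, x).

```python
def a_hat(A, a, x):
    """The transpose of a at x: the set of y in A with (y, x) in a."""
    return frozenset(y for y in A if (y, x) in a)

def forall_along(A, B, a):
    """Pi_A(a): those x in B whose fibre a_hat(x) is all of A."""
    whole = frozenset(A)   # the value at every x of the transpose of `true`
    return {x for x in B if a_hat(A, a, x) == whole}

A = [0, 1]
B = ["p", "q", "r"]
a = {(0, "p"), (1, "p"), (0, "q")}
result = forall_along(A, B, a)
```

The result coincides with the direct description "for all y in A the pair (y, x) is in a", here the single element "p".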


Since the morphisms in a topos correspond to total maps they give us no
direct information about partial elements. To interpret statements about
partial elements in a topos we must use the higher-order structure to
represent each object A as a subobject of some Ã thought of as the
domain of potential elements of A. In a topos E(T) we can construct such a
domain letting Ã be
{x:[A] | ∀y,z:A ((x(y) ∧ x(z)) → (y = z ∧ y ∈ A))}


with the embedding of A given by λx.{z : A | x = z}. This representation
has the important property that predicates on A are in 1-1 correspondence
with subdomains of A since x = y ↔ {z : A | z = x} = {z : A | z = y}. The
rest of this section is devoted to generalizing this construction to arbitrary
topoi and examining its categorical properties.


6.8. DEFINITION. We write {·}_B for the transpose of =_B, a morphism
B → Ω^B, and call it the singleton morphism. Now let ε : B × Ω^B → Ω
classify ⟨B, {·}⟩ : B → B × Ω^B and take e to be an equalizer.


[Diagram 8: e : B̃ ↣ Ω^B, the equalizer of ε̂ (the transpose of ε) and the identity on Ω^B.]


Since ev∘(B × ({·}∘f)) = =_B∘(B × f) classifies ⟨f, X⟩ : X → B × X, we see
that f is recoverable from {·}∘f. Thus {·} is monic. Also =_B and
ε∘(B × {·}) are equal since they both classify the diagonal. Thus {·} =
ε̂∘{·} and we have a factorization


{·} = e∘η,


where η is monic as {·} is. We call η : B ↣ B̃ a partial map classifier for B
(see 6.12).


6.9. LEMMA. The subobject e : B̃ ↣ Ω^B is a retract (i.e. there is a morphism
r : Ω^B → B̃ such that r∘e = B̃).


Proof. Since ε̂∘{·} = {·} we have a pullback.


[Diagram 9: pullback square formed by ⟨B, {·}⟩ : B → B × Ω^B and B × ε̂ : B × Ω^B → B × Ω^B.]


Thus ε∘(B × ε̂) = ε (they classify the same subobject), hence ε̂∘ε̂ = ε̂ and
we have a factorization


ε̂ = e∘r.


Now e∘B̃ = e = ε̂∘e = e∘r∘e and B̃ = r∘e as e is monic. □


6.10. LEMMA. The projection π : B̃ × X → X is split (i.e. there is a morphism
s : X → B̃ × X such that π∘s = X).


Proof. It suffices to find a morphism from X to B̃ and pair it with the
identity on X. Let t = true_{B × X} and let s = ⟨r∘t̂, X⟩. □


This lemma expresses in a categorical way the fact that B̃ is inhabited.


6.11. DEFINITION. A partial map from A to B is a pair (f, d) : A ⇀ B
where d : A′ ↣ A is a subobject of A and f : A′ → B.


6.12. THEOREM. Given a partial map (f, d) : A ⇀ B there is a unique
morphism f̃ : A → B̃ making the diagram a pullback. We say that f̃ classifies
the partial map (f, d).


[Diagram 10: square with d : A′ ↣ A, f : A′ → B, η : B ↣ B̃ and f̃ : A → B̃.]


Proof. Given a partial map (f, d) : A ⇀ B we have a subobject
⟨f, d⟩ : A′ ↣ B × A called the graph of (f, d). This is classified by a
morphism f* : B × A → Ω. We show that the transpose f̂* : A → Ω^B of f*
factors as e∘f̃ and that f̃ has the desired properties.

If {·}∘h = f̂*∘g, then =_B∘(B × h) = f*∘(B × g), so =_B∘⟨h, h⟩ = f*∘⟨h, g⟩.
But =_B∘⟨h, h⟩ factors through true so ⟨h, g⟩ factors through ⟨f, d⟩. Since
⟨f, d⟩ is mono, this factorization is unique.

Also =_B∘(B × f) = f*∘(B × d) since they both classify ⟨f, A′⟩. We have
shown that the square (i) is a pullback. Thus (ii) is a pullback so


[Diagram 11(i): square with d : A′ ↣ A, f : A′ → B, {·} : B → Ω^B and f̂* : A → Ω^B.]
[Diagram 11(ii): square with ⟨f, d⟩ : A′ → B × A, f : A′ → B, ⟨B, {·}⟩ : B → B × Ω^B and B × f̂* : B × A → B × Ω^B.]


ε∘(B × f̂*) = f* since they both classify the same subobject. Hence ε̂∘f̂* =
f̂* and we have a factorization


f̂* = e∘f̃.


We obtain the diagram


[Diagram 12: the square of Diagram 10 (inner) inside the square of Diagram 11(i) (outer), joined by e : B̃ ↣ Ω^B.]


where the inner square commutes since e is monic and is a pullback since
the outer one is.

Conversely, given such a diagram with the inner square a pullback, the
square


[Diagram 13: square with ⟨f, d⟩ : A′ → B × A on top and ⟨B, η⟩ : B → B × B̃ below.]


is a pullback.
Thus ε∘(B × (e∘f̃)) = f* (since they classify the same subobject). Hence
ε̂∘e∘f̃ = f̂* and e∘f̃ = f̂*. As e is monic, this tells us that f̃ is unique and
we are done. □
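
In the category of finite sets the partial map classifier can be pictured with subsingletons: an element of B̃ is a subset of B with at most one member, and η sends b to {b}. The Python sketch below illustrates Theorem 6.12 under that assumption, with partial maps given as dicts on their domain; eta, classify and recover are hypothetical names.

```python
def eta(b):
    """The embedding of B into its subsingleton subsets."""
    return frozenset([b])

def classify(A, partial):
    """The total map A -> subsingletons of B classifying a partial map."""
    return {x: (eta(partial[x]) if x in partial else frozenset()) for x in A}

def recover(total):
    """Pull back along eta: the domain and values of the partial map."""
    return {x: next(iter(s)) for x, s in total.items() if len(s) == 1}

A = [1, 2, 3, 4]
partial = {1: "a", 3: "b"}     # defined only on the subset {1, 3}
total = classify(A, partial)
```

Elements outside the domain are sent to the empty subsingleton, and pulling back along eta returns exactly the original partial map.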


7. Interpretations in topoi 


We extend our talk of “elements” of an object by showing how the logic
of Section 3 may be interpreted in an arbitrary topos.


7.1. DEFINITION. Let E be a topos, L a language. An interpretation of L in E
is a map ℐ assigning to each sort A an object ℐA, to each constant c of sort
A a morphism ⟦c⟧ : 1 → Ã and to each finite sequence (A_0, …, A_{n−1}) of
sorts an isomorphism


ℐ[A_0, …, A_{n−1}] ≅ Ω^(ℐA_0 × … × ℐA_{n−1}).


By abuse of notation we write A for ℐA. We also suppose for simplicity
that [A_0, …, A_{n−1}] = Ω^(A_0 × … × A_{n−1}). Strictly speaking the
“canonical” isomorphisms
given by the interpretation are necessary because of the abstract nature of
the power-type map. Having realized this it is best, in the interests of
clarity, to forget them.


7.2. DEFINITION. If Δ is a finite sequence of distinct variables of L we
define an object XΔ of E by induction on the length of Δ:
X( ) = 1, X(Δ, x) = XΔ × Ã (where #x = A).


Free variables are to range over potential elements; XΔ is to be thought of
as the object of sequences of potential elements over which the sequence Δ
shall range. If rng Δ ⊆ rng Γ, then the product of projections gives a
morphism


π^Γ_Δ : XΓ → XΔ


which should be thought of as taking an n-tuple of partial elements of XΓ
to the corresponding m-tuple of partial elements of XΔ. If Δ =
(x_0, …, x_{n−1}, y), where #y = [#x_0, …, #x_{n−1}], there is a morphism


ev : X#x_i × Ω^(X#x_i) → Ω
which classifies some subobject. Composing this subobject with the mono
Xη : X#x_i × Ω^(X#x_i) ↣ XΔ
we get a subobject of XΔ whose classifier we denote by
ẽv_Δ : XΔ → Ω.
We shall also need the morphisms


η_x : A × X(Δ\x) → XΔ (where #x = A)


given by taking η_A∘π in the x coordinate and the appropriate projection
in each other one. These morphisms serve to restrict our attention to those
x which exist and will be used in the interpretation of the variable binding
operators.

Finally let E_A : Ã → Ω classify η_A.


7.3. DEFINITION. For φ a formula, τ a term and Δ a non-repetitive finite
sequence of variables containing the free variables of φ and τ, we define
the valuations


⟦φ⟧_Δ : XΔ → Ω,
⟦τ⟧_Δ : XΔ → Ã (where τ:A).


A formula is interpreted by a subobject of the domain of tuples of potential
elements. A term is interpreted by the total map which to a tuple of
potential elements assigns the appropriate potential element of A. The
definition is by induction on structure according to the following schemata:
⟦c⟧_Δ = ⟦c⟧∘π^Δ_( ),
⟦x⟧_Δ = π^Δ_x,
⟦τ = σ⟧_Δ = =_Ã∘⟨⟦τ⟧_Δ, ⟦σ⟧_Δ⟩,
⟦Eτ⟧_Δ = E_A∘⟦τ⟧_Δ,
⟦τ(σ_0, …, σ_{n−1})⟧_Δ = ẽv∘⟨⟦σ_0⟧_Δ, …, ⟦σ_{n−1}⟧_Δ, ⟦τ⟧_Δ⟩,
⟦φ ∧ ψ⟧_Δ = ∧∘⟨⟦φ⟧_Δ, ⟦ψ⟧_Δ⟩,
⟦φ → ψ⟧_Δ = →∘⟨⟦φ⟧_Δ, ⟦ψ⟧_Δ⟩,
⟦∀x.φ⟧_{Δ\x} = Π_A(⟦φ⟧_Δ∘η_x).


To define ⟦ιx.φ⟧_{Δ\x} given ⟦φ⟧_Δ, let φ̂ : X(Δ\x) → Ω^A be the transpose of
⟦φ⟧_Δ∘η_x. Pulling φ̂ back along {·}_A we get a partial map X(Δ\x) ⇀ A. Let
⟦ιx.φ⟧_{Δ\x} be the classifier of this partial map.

To supplement the last two clauses we must show how to define ⟦ψ⟧_Γ and
⟦τ⟧_Γ from ⟦ψ⟧_{Γ\x} and ⟦τ⟧_{Γ\x} where x ∉ FV(ψ), FV(τ). This is done by the
schemata

⟦ψ⟧_Γ = ⟦ψ⟧_{Γ\x}∘π^Γ_{Γ\x}, ⟦τ⟧_Γ = ⟦τ⟧_{Γ\x}∘π^Γ_{Γ\x},


taking Γ to be Δx. Throughout this definition we have made the tacit
assumption that #x = #τ = A.


7.4. DEFINITION. The interpretation ℐ satisfies φ (symbolically ℐ ⊨ φ) iff
whenever FV(φ) ⊆ rng Δ the morphism ⟦φ⟧_Δ factors through true. If T is a
theory, we say ℐ is a model of T, symbolically ℐ ⊨ T, iff for all φ ∈ T we
have ℐ ⊨ φ.


It follows from Lemma 6.10 that π^Γ_Δ is split, thus ⟦φ⟧_Γ factors through
true iff ⟦φ⟧_Δ does, where Δ is some enumeration of FV(φ). It follows that
ℐ ⊨ φ iff ⟦φ⟧_Γ factors through true for some Γ with FV(φ) ⊆ rng Γ.


7.5. DEFINITION. If τ is a term with FV(τ) ⊆ rng Γ and rng(Δ\x) ⊆ rng Γ
we define the morphism

⟦τ/x⟧^Γ_Δ : XΓ → XΔ
to be ⟦τ⟧_Γ in the x-coordinate and the appropriate projection in the others.


If x ∉ rng Δ, then ⟦τ/x⟧^Γ_Δ is just π^Γ_Δ.


7.6. LEMMA.


⟦σ[τ/x]⟧_Γ = ⟦σ⟧_Δ∘⟦τ/x⟧^Γ_Δ, ⟦φ[τ/x]⟧_Γ = ⟦φ⟧_Δ∘⟦τ/x⟧^Γ_Δ.


Proof. By induction on the structures of σ and φ. □


7.7. LEMMA. Writing ‖φ‖_Δ for ‖⟦φ⟧_Δ‖ we have
‖φ[τ/x]‖_Γ = {f | ⟦τ/x⟧^Γ_Δ∘f ∈ ‖φ‖_Δ},
‖Eτ‖_Δ = {f | ⟦τ⟧_Δ∘f factors through η},
‖τ = σ‖_Δ = {f | ⟦τ⟧_Δ∘f = ⟦σ⟧_Δ∘f},
‖φ ∧ ψ‖_Δ = ‖φ‖_Δ ∩ ‖ψ‖_Δ,
‖φ → ψ‖_Δ = {f | for all g (f∘g ∈ ‖φ‖_Δ → f∘g ∈ ‖ψ‖_Δ)},
‖∀xφ‖_{Δ\x} = {f | η_A × f ∈ ‖φ‖_Δ}.


Proof. The first of these follows from the preceding lemma, the rest from
Lemma 6.5 and the definition of valuation. □


7.8. SOUNDNESS THEOREM (for interpretations in topoi). If φ is an axiom,
then ℐ ⊨ φ; furthermore if ℐ satisfies the premisses of a rule of inference, then
it satisfies the conclusion.


Proof. The propositional axioms and (=) are easily checked using the 
preceding lemma as are the rules of substitution and modus ponens. We 
now check the remaining axioms and the introduction rule for the 
quantifier individually. 


Axiom (E). For f : Z → Ã × Ã, we show that if


f ∈ ‖∀z (z = x ↔ z = y)‖_{x,y},


then π_1∘f and π_2∘f classify the same partial map. By the uniqueness of
partial map classifiers, this shows that they are equal and we are done.
Let f be as above and g be such that we have a factorization


π_2∘f∘g = η_A∘h.
It suffices by symmetry to show that
π_1∘f∘g = η_A∘h.
Now
⟨π_2∘f∘g, f∘g⟩ = (η_A × f)∘⟨h, g⟩ ∈ ‖z = x ↔ z = y‖_{z,x,y}.
But
⟨π_2∘f∘g, f∘g⟩ ∈ ‖z = y‖_{z,x,y}
so
⟨π_2∘f∘g, f∘g⟩ ∈ ‖z = x‖_{z,x,y}
whence
π_2∘f∘g = π_1∘f∘g.


Axiom (∀), Rule (∀⁺). We deal with the rule and the axiom together, by
showing that if x ∉ FV(φ) then Ex ∧ φ → ψ is satisfied iff φ → ∀x:A ψ is.
Now


(⟦φ⟧_{Δ\x} ≤ ⟦∀xψ⟧_{Δ\x} = Π_A(⟦ψ⟧_Δ∘η_x)) iff ⟦φ⟧_Δ∘η_x ≤ ⟦ψ⟧_Δ∘η_x
iff ⟦φ ∧ Ex⟧_Δ ≤ ⟦ψ⟧_Δ.


Since φ → ψ is satisfied iff ⟦φ⟧_Δ ≤ ⟦ψ⟧_Δ we are done.


Axiom (ι). We must show that for all f : X → XΔ we have
η_A × f ∈ ‖y = ιxφ ↔ ∀x (φ ↔ x = y)‖_{y,Δ}.
From the definition of ⟦ιxφ⟧ we see that for h : Y → A × XΔ,


η_A∘π_1∘h = ⟦ιxφ⟧∘(η_A × XΔ)∘h
iff
=_A∘π_2∘(A × h) = ⟦φ⟧∘(η_A × η_A × XΔ)∘(A × h).
Thus for all g : Y → A × X we have


(η_A × f)∘g ∈ ‖y = ιxφ‖_{y,Δ}
iff
(η_A × f)∘g ∈ ‖∀x (φ ↔ x = y)‖_{y,Δ}


(as =_Ã∘(η_A × η_A) = =_A).
Axiom (Comp). It suffices to show that ⟦ιy ∀x̄ (φ ↔ y(x̄))⟧ factors through
η. We show that

⟦ιy ∀x̄ (φ ↔ y(x̄))⟧ = η∘φ̂,


where φ̂ is the transpose of ⟦φ⟧_{x̄,Δ}∘(Xη × XΔ). For this it is enough to show
that


=_{Ω^A}∘(Ω^A × φ̂) = ⟦∀x̄ (φ ↔ y(x̄))⟧∘η_y.
Now since ẽv∘(Xη × η) = ev, we have


f ∈ ‖⟦∀x̄ (φ ↔ y(x̄))⟧_{y,Δ}∘η_y‖
iff
⟦φ⟧_{x̄,Δ}∘(Xη × (π_2∘f)) = ev∘(XA_i × (π_1∘f))
iff
φ̂∘π_2∘f = π_1∘f
iff
f ∈ ‖=_{Ω^A}∘(Ω^A × φ̂)‖.
Axiom (Pred). This is immediate from the definition of ẽv.


7.9. DEFINITION. To each interpretation ℐ we associate a theory
T(ℐ) = {φ | ℐ ⊨ φ}.


For any theory T the canonical interpretation ℐ(T) of T in E(T) is defined
by interpreting L(T) in E(T) as follows: Each sort A is interpreted by the
type A = {x : A | x = x}. If c is a constant of sort A in L(T) we have a
morphism

λz:[ ].c : {z:[ ] | z( ) ∧ Ec} → A


defined on a subobject of 1. We let
⟦c⟧ : 1 → Ã


be its classifier.
7.10. THEOREM. ℐ(T) ⊨ φ iff T ⊢ φ.


Proof (outline). Since every formula is logically equivalent to a
description-free formula and our interpretations are sound it suffices to
consider description-free φ. The result follows from the fact that for such φ
the morphism


⟦φ⟧_Δ : XΔ → Ω
is just
λx_0, …, x_{n−1}.ιy:[ ](y( ) ↔ φ(ιz.x_0(z), …, ιz.x_{n−1}(z)))


which is proved by a tedious induction on the structure of φ.


7.11. COROLLARY. Our axiomatization of the logic of partial elements is
complete for interpretations in topoi.


For a proof of a completeness theorem for our logic along more
traditional lines see FOURMAN and SCOTT [1977]. Let ℐ now be an
interpretation of L in E. For A a definable type we extend the abuse which
identifies a sort A with the object interpreting it by writing A for the
subobject of Ã classified by ⟦x ∈ A⟧. Since ⊢ x ∈ A → Ex we have a
factorization of the inclusion γ_A = η_A∘i_A : A → Ã.


7.12. LEMMA. If A and B are definable types and F is a definable relation
from A to B, then

ℐ ⊨ ∀x ∈ A.F′(x) ∈ B


iff there is a morphism f : A → B in E such that either of the following
(equivalent) diagrams commutes.


[Diagram 14: two squares. Left: γ_A × γ_B : A × B → Ã × B̃, f × B : A × B → B × B, ⟦F(x, y)⟧ : Ã × B̃ → Ω, =_B : B × B → Ω. Right: γ_A : A → Ã, f : A → B, ⟦F′(x)⟧ : Ã → B̃, γ_B : B → B̃.]


In this case f is uniquely determined by F. Two such relations F and G
determine the same morphism iff


ℐ ⊨ ∀x ∈ A.F′(x) = G′(x).


Proof. ℐ ⊨ ∀x ∈ A.F′(x) ∈ B iff ⟦F′(x)⟧∘γ_A factors as γ_B∘f (f is then
uniquely determined as γ is monic) iff


⟦F(x, y)⟧^∘γ_A = {·}_B∘i_B∘f
(where ^ denotes the transpose; see the definition of ⟦ιx.φ⟧ in 7.3) iff
⟦F(x, y)⟧∘(η_B × γ_A) = =_B∘(B × (i_B∘f))
iff
⟦F(x, y)⟧∘(γ_B × γ_A) = =_B∘(B × f).
The last part is obvious from the fact that


ℐ ⊨ ∀x ∈ A, y ∈ B.(F(x, y) ↔ G(x, y))
iff
⟦F(x, y)⟧∘(γ_B × γ_A) = ⟦G(x, y)⟧∘(γ_B × γ_A). □


Let ℐ be an interpretation of L in E. If F is a definable relation from A
to B such that ℐ ⊨ ∀x ∈ A.F′(x) ∈ B we shall as a temporary (and most
abusive) notation write F for the corresponding morphism from A to B in
E (taking care that there can be no confusion as to the intended domain and
codomain).


7.13. THEOREM. (1) F∘G = H in E iff ℐ ⊨ ∀x ∈ A (F′(G′(x)) = H′(x)).
(2) A is a terminal object in E iff ℐ ⊨ ∀x,y ∈ A.x = y.
(3) A ← C → B (with legs P : C → A and Q : C → B) is a product in E iff


ℐ ⊨ ∀x ∈ A, ∀y ∈ B.∃!z ∈ C (P′(z) = x ∧ Q′(z) = y).
(4)


[Diagram 15: square with K : A → B, H : A → C, and F, G the morphisms from B and C to the remaining corner.]


is a pullback in E iff it commutes and
ℐ ⊨ ∀x ∈ B, y ∈ C (F′(x) = G′(y) → ∃!z ∈ A (K′(z) = x ∧ H′(z) = y)).
(5) EV : A × X → B is an exponent in E iff
ℐ ⊨ ∀z:[A, B] (∀x ∈ A, ∃!y ∈ B.z(x, y)
→ ∃!w ∈ X.∀x ∈ A, y ∈ B (z(x, y) ↔ EV′(⟨x, w⟩) = y)).
(6) T : 1 → X is a subobject classifier in E iff
ℐ ⊨ ∀x:[ ].∃!y ∈ X (y = T′(*) ↔ x( )),


where * is the term ιz:1.z = z.


Proof. Straightforward using the methods of Section 4. □


8. Topoi as theories 


8.1. DEFINITION. If E is a topos, L(E), the language of E, has as sorts the
objects of E and as constants of sort A the pairs (c, A) where c : 1 → Ã. The
power-type map is given by


[A_0, …, A_{n−1}] = Ω^(A_0 × … × A_{n−1}).


The canonical interpretation ℐ(E) of L(E) in E is given by interpreting each
sort by itself and letting ⟦(c, A)⟧ = c. We may often write c for (c, A) where
confusion is unlikely. The diagram or theory of E is the set T(E) of
formulae of L(E) satisfied by the canonical interpretation.


8.2. THEOREM. If F is a topos, C(T(F)) ≅ F.


Proof. The objects of C(T(F)) are the sorts of L(F) which in turn are the
objects of F. Each morphism f : A → B in F gives rise to a constant c of sort
[A, B] of L(F) corresponding to the transpose of =_B∘(B × f). By Lemma
7.12 the relation


f̄ = ιz:[A, B].∀x:A, y:B (z(x, y) ↔ c(x, y))
represents a morphism of C(T(F)) and every such morphism arises from a
unique morphism in F. That composition in C(T(F)) corresponds to
composition in F follows directly from Theorem 7.13(1). □


We now consider the question of why C(T) is not in general a topos. This 
happens because it may lack objects. 


8.3. DEFINITION. A theory T is definitionally complete iff for each definable
type A there is a sort B, and definable relation F from B to A such that


T ⊢ ∀x:B.EF′(x),
T ⊢ ∀x,y:B (F′(x) = F′(y) → x = y),
T ⊢ ∀z:A (∃x:B.z = F′(x) ↔ z ∈ A).


8.4. LEMMA. If F is a topos, T(F) is definitionally complete.


Proof. Consider the canonical interpretation of T(F) in F. If A is a
definable type the morphism


⟦x ∈ A⟧∘η : A → Ω
classifies a subobject of A which gives rise to the required sort and
definable relation. □


There is a canonical functor I (taking A to {x:A | Ex}) which embeds 
C(T) as a full subcategory of E(T). 


8.5. THEOREM. I is an equivalence of categories (every object in E(T) is
isomorphic to one in the image of C(T)) iff T is definitionally complete.


Proof. Immediate. □


8.6. COROLLARY. If T is definitionally complete, C(T) is a topos.


8.7. DEFINITION. A functor H : E → F between topoi is logical (a morphism
of topoi) iff it preserves finite limits, exponents and subobject classifiers.


In logical terms, a logical functor may be pictured as an interpretation of
one theory in another which is standard in that it interprets power-sorts by
power-sorts.

If H : E → F is logical, then for any interpretation ℐ of L in E, the
composite map H∘ℐ gives an interpretation of L in F. (The necessary
isomorphisms being given by the uniqueness up to a unique isomorphism
of products, exponents and subobject classifiers.)


8.8. LEMMA.
⟦φ⟧_{H∘ℐ} = H⟦φ⟧_ℐ,
⟦τ⟧_{H∘ℐ} = H⟦τ⟧_ℐ.


Proof. By induction on the structure of φ and τ. □


8.9. THEOREM. For any model ℐ of a theory T in a topos F there is a logical
functor H : E(T) → F unique up to a unique isomorphism such that
H∘ℐ(T) = ℐ.


Proof. Firstly we note that uniqueness up to isomorphism follows from
the preceding lemma together with Lemma 7.12. Since ℐ is a model of T,
Lemma 7.12 tells us how to associate a morphism f = H(F) : A → B in F to
a morphism F : A → B in E(T). That this map is in fact a logical functor
follows from Theorem 7.13. □


We see that topoi correspond to certain definitionally complete theories 
and logical morphisms correspond to interpretations which are models for 
these theories. Every theory T has a definitionally complete conservative 
extension — the theory associated to E(T). (In fact this extension is not 
only conservative, but also inessential in the sense that it has a natural 
interpretation in T: its expressive power is no greater than that of T.) 

As remarked at the end of Section 5, the axioms for topoi contain no 
analogue of the set-theoretic axioms of infinity and replacement. The force
of the replacement axiom is to replace an unbounded quantifier by a
restricted one. Since we have no categorical analogue of the unbounded 
quantifier the absence of replacement from the topos axioms is not 
surprising. A categorical version of an axiom of infinity (due to Lawvere) is 
given by demanding that there be a natural number object (NNO). We 
shall examine briefly which theories correspond to topoi with NNO. 


8.10. DEFINITION. A natural number object is an object N equipped with
morphisms 0 : 1 → N and s : N → N satisfying the recursion property: for any
diagram a : 1 → A, f : A → A there is a unique morphism k : N → A such that
k∘0 = a and f∘k = k∘s.
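
In the category of sets, with N the ordinary natural numbers, 0 the number zero and s the successor, the morphism k required by the recursion property is definition by iteration. The Python sketch below illustrates this special case only; recursor is a hypothetical name and the doubling example is an assumption chosen for the illustration.

```python
def recursor(a, f):
    """The map k with k(0) = a and k(n + 1) = f(k(n))."""
    def k(n):
        value = a
        for _ in range(n):
            value = f(value)
        return value
    return k

def double(v):
    return 2 * v

# Hypothetical example: a = 1 and f = doubling give k(n) = 2 ** n.
k = recursor(1, double)
```

The two equations k∘0 = a and f∘k = k∘s then read k(0) = 1 and double(k(n)) = k(n + 1).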


8.11. THEOREM (HATCHER [1968], OSIUS [1975], Folklore). An object N with
the structure 0 : 1 → N, s : N → N is a NNO in the topos E iff the following
formulae (Peano's Axioms) are satisfied by the canonical interpretation of
L(E) in E:


(P3) ∀x:N.¬0 = s′(x),
(P4) ∀x,y:N (s′(x) = s′(y) → x = y),
(P5) ∀X:[N] (0 ∈ X ∧ ∀x ∈ X.s′(x) ∈ X → ∀x:N.x ∈ X).


A converse also holds: 


8.12. THEOREM. Let T be a theory in a language with a constant 0:N and a
definable relation s from N to N such that T proves the axioms P3-P5 of
Theorem 8.11 and in addition


(P1) E0,
(P2) ∀x:N.Es′(x).
Then in E(T) we have a natural number object
0 : 1 → N, s : N → N.


We now consider briefly two applications:


8.13. THEOREM (Mikkelsen, PARÉ [1974]). Topoi have finite colimits.


Proof. It suffices to consider topoi of the form E(T) since by Theorems 8.2
and 8.5 every topos is equivalent to one such. As in Section 4 we may now
use straightforward logical constructions of colimits. It suffices to have an
initial object ({x:[ ] | ⊥}), disjoint sums (A + B may be constructed as a
subtype of [[A],[B]]) and coequalizers (the coequalizer of F and
G : A → B has as codomain a subtype of [B]). □
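
For finite sets, the two constructions indicated in this proof can be written out: a disjoint sum as pairs of subsets in which exactly one is a singleton and the other is empty, and a coequalizer as a set of subsets of B (the classes of the generated equivalence relation). The Python sketch below is a hedged illustration of that reading; the function names and sample data are hypothetical.

```python
def disjoint_sum(A, B):
    """A + B as pairs (S, T) of subsets, one a singleton and the other empty."""
    left = [(frozenset([x]), frozenset()) for x in A]
    right = [(frozenset(), frozenset([y])) for y in B]
    return left + right

def coequalizer(A, B, F, G):
    """Classes of the least equivalence relation on B identifying F(x) and G(x)."""
    cls = {y: frozenset([y]) for y in B}
    for x in A:
        merged = cls[F[x]] | cls[G[x]]
        for y in merged:
            cls[y] = merged
    return set(cls.values()), cls

# Hypothetical sample data.
A = [0, 1]
B = ["p", "q", "r", "s"]
F = {0: "p", 1: "q"}
G = {0: "q", 1: "r"}
classes, quotient_map = coequalizer(A, B, F, G)
```

The quotient map sends each element of B to its class, and by construction it takes the same value on F(x) and G(x) for every x in A.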


Other categorical properties of topoi — exactness properties, the 
existence of a right adjoint to pulling back etc. — follow easily using this 
method. As a second application we consider the construction of free 
topoi. (In essence our method is the same as that of VOLGER [1975].)


8.14. DEFINITION. Let G be a directed graph (in the sense of MAC LANE
[1971] p. 48). By a condition C on G we mean a statement in the expansion
of the language of category theory having names for the nodes and paths of
G. It is evident what it means for such a condition to be satisfied by a
morphism from G to (the underlying graph of) a category C.


8.15. DEFINITION. The language L(G) of a directed graph G is constructed
as follows. The set of sorts of L(G) is the least set of expressions containing
the nodes of G and closed under the syntactic operation


$(A_0, \ldots, A_{n-1}) \mapsto [A_0, \ldots, A_{n-1}]$


(here we consider the nodes as formal symbols and the square brackets and
commas as formal punctuation marks). For each edge f : A → B of G we
have in L(G) a constant f of sort [A, B].


8.16. LEMMA. For f : A → B in G let $f' = \lambda x.\, \mathrm{I} y : B.\ f(x, y)$ be the corresponding
relation in L(G). To any morphism F from G to a topos E there
corresponds an interpretation $\mathcal{I}(F)$ of L(G) in E such that for each
f : A → B in G,

$\mathcal{I}(F) \models \forall x : A.\ E f'(x)$.


Conversely to any such interpretation there corresponds an essentially (up to
isomorphism) unique morphism from G to E.


8.17. DEFINITION. A condition C on G is said to be internally expressible iff
there is a formula φ of L(G) such that F satisfies C iff

$\mathcal{I}(F) \models \varphi$.

8.18. THEOREM. If G is a directed graph and $\mathcal{C}$ a set of internally expressible
conditions on G, then there is a topos $\mathcal{F}$ and a morphism $F : G \to \mathcal{F}$ satisfying
(the conditions in) $\mathcal{C}$, such that for every topos E and morphism $H : G \to E$
satisfying $\mathcal{C}$ there is a logical functor $K : \mathcal{F} \to E$ unique up to isomorphism
such that $K \circ F = H$.


PROOF. The required topos is just E(T) where T is the theory in L(G)
having as axioms


$\forall x : A.\ E f'(x)$ for f : A → B in G
and for each $C \in \mathcal{C}$ the corresponding formula φ in L(G). □


This theorem gives many free topoi since as we have seen many 
conditions are internally expressible. Here is a short list of such conditions: 
To be monic (epic, iso); to be a (co)product; to be a (co)equalizer; to be 
initial (terminal); to be a subobject classifier; to be an exponent; to be a 
pullback (pushout). 

We hope in this chapter to have demonstrated (at least) two things: that 
categories are not as mysterious as they seem, and that topoi are not 
mysterious categories. 

We have shown that in talking about some categories (topoi) we may talk 
concretely in terms of elements. This process may be extended to other 
categories by suitably embedding them in topoi. The lack of "elements" is
no more mysterious than is a theory in which $\vdash \exists x\, \varphi(x)$ without there being
a term τ such that $\vdash \varphi(\tau)$.

Topoi result if we apply intuitionistic logic (with partial elements) to the
basic intuition of (finite) power types satisfying comprehension and
extensionality. Other category theoretic abstractions may be viewed similarly:
for example any abelian category may be represented as a full sub-abelian 
category of the category Ab(E) of abelian groups of some topos E. Thus it 
may be viewed as a category of abelian groups in some appropriate logic. 

To conclude we mention an embarrassment. The important morphisms
between topoi are geometric morphisms (ARTIN, GROTHENDIECK and
VERDIER [1972]). Our treatment does not explain this fact. REYES [1975] (see
also Chapter A.8 of this volume) has shown how a Grothendieck topos may
be viewed as the extension of Sets obtained by adding a generic model for a
suitable (possibly infinitary) first-order theory T. Geometric morphisms
then arise naturally from a consideration of models (in Grothendieck
topoi) of such theories. This approach however depends on some fixed
"base topos" (in this case Sets) and fails to explicate the notion of topos in
abstracto.



References 


ARTIN, M., A. GROTHENDIECK and J.L. VERDIER, editors
[1972] Théorie des topos et cohomologie étale des schémas. Tome 1, Séminaire de
Géométrie Algébrique du Bois-Marie 1963/64 (SGA 4), Lecture Notes in
Mathematics, Vol. 269 (Springer, Berlin).
BOILEAU, A.
[1975] Types vs topos, Mimeo, Université de Montréal, Montréal, Canada.
COSTE, M.
[1974] Logique d'ordre supérieur dans les topos élémentaires, Mimeo, Séminaire
Bénabou, Paris.
FOURMAN, M.P.
[1974] Connections between category theory and logic, Ph.D. Thesis, Oxford.
FOURMAN, M.P. and D.S. SCOTT
[1977] Sheaves and logic, in: Applications of Sheaves, Proceedings of Durham
Symposium, to appear.
GRAY, J.W.
[1971] The meeting of the Midwest Category Seminar in Zurich, in: Reports of the
Midwest Category Seminar V, edited by J.W. Gray, Lecture Notes in Mathematics,
Vol. 195 (Springer, Berlin) pp. 248-255.
HATCHER, W.
[1968] The Foundations of Mathematics (Saunders, Philadelphia, PA).
HIGGS, D.
[1974] A category approach to Boolean-valued set theory, preprint, Waterloo, to appear.
LAWVERE, F.W.
[1970] Quantifiers and sheaves, in: Actes du Congrès International des Mathématiciens,
Tome 1, pp. 329-334.
[1972] Introduction, in: Toposes, Algebraic Geometry and Logic, edited by F.W. Lawvere,
Lecture Notes in Mathematics, Vol. 274 (Springer, Berlin) pp. 1-12.
[1975] Introduction, in: Model Theory and Topoi, edited by F.W. Lawvere, C. Maurer and
G.C. Wraith, Lecture Notes in Mathematics, Vol. 445 (Springer, Berlin) pp. 3-14.
LAWVERE, F.W. and M. TIERNEY
[1970] Lectures on elementary topoi, Midwest Category Seminar, Zurich. (Summarized in
GRAY [1971].)
MAC LANE, S.
[1971] Categories for the Working Mathematician (Springer, Berlin).
OSIUS, G.
[1975] Logical and set theoretical tools in elementary topoi, in: Model Theory and Topoi,
edited by F.W. Lawvere, C. Maurer and G.C. Wraith, Lecture Notes in
Mathematics, Vol. 445 (Springer, Berlin) pp. 297-346.
PARÉ, R.
[1974] Colimits in topoi, Bull. Am. Math. Soc., 80 (3), 556-561.
REYES, G.
[1975] From sheaves to logic, in: Studies in Algebraic Logic, edited by A. Daigneault,
M.A.A. Studies, Vol. 9 (Math. Assoc. Am., Buffalo, NY) pp. 143-204.
ROUSSEAU, C.
[1977] Complex analysis and topoi, J. Pure and Appl. Algebra, to appear.
SCOTT, D.S.
[1969] Boolean models and non-standard analysis, in: Applications of Model Theory to
Algebra, Analysis and Probability, edited by W.A.J. Luxemburg (Holt, Rinehart
and Winston, NY) pp. 87-92.
[1977] Identity and existence in intuitionistic logic, in: Applications of Sheaves,
Proceedings of Durham Symposium, to appear.
TAKEUTI, G.
[1977] Boolean-valued analysis, in: Applications of Sheaves, Proceedings of Durham
Symposium, to appear.
VOLGER, H.
[1975] Logical categories, semantical categories and topoi, in: Model Theory and Topoi,
edited by F.W. Lawvere, C. Maurer and G.C. Wraith, Lecture Notes in
Mathematics, Vol. 445 (Springer, Berlin) pp. 97-100.


D.7 


The Type Free Lambda Calculus* 


HENK P. BARENDREGT** 


Contents

0. Introduction
1. Towards the theory
2. Classical λ-calculus
3. Construction of $P\omega$
4. Construction of $D_\infty$
5. Solvability
6. Böhm trees
7. Analysis of $D_\infty$

References


* Part of the chapter was written while visiting the Forschungsinstitut für Mathematik, ETH, Zurich.
** The author wishes to thank Dana Scott for his basic remarks on a draft of this chapter, and Jane Bridge and Jeff Zucker 
for pointing out several errors in the text. 


HANDBOOK OF MATHEMATICAL LOGIC 
© North-Holland Publishing Company, 1977 Edited by J. Barwise 




0. Introduction 


The λ-calculus and its variable free equivalent, combinatory logic, were
initiated around 1930 by Church, Schönfinkel and Curry respectively. The
intention of the founders of the subject was to study rules; in other words
to study the old-fashioned notion of "function" in the sense of definition. In
contrast to Dirichlet’s notion (of graph, that is the set of pairs of argument 
and associated value) the older notion referred to the process of stepping 
from argument to value, a process coded by a definition. Generally we 
think of such definitions as given by words in ordinary English, applied to 
arguments also expressed by words (in English). Or, more specifically, we 
may think of the definitions as programs for machines applied to, that is, 
operating on, such programs. In both cases we have to do with a type free 
structure, where the objects of study are at the same time function and 
argument. In particular, a function can be applied to itself. For the usual 
conception of a function in mathematics (in Zermelo-Fraenkel set theory) 
this is impossible (because of the axiom of foundation). 

The λ-calculus represents a class of (partial) functions (λ-definable
functions) on the integers which turns out to be the class of (partial)
recursive functions. The equivalence between the Turing computable
functions and the general recursive functions was originally proved via the
λ-calculus: the general recursive functions are exactly the λ-definable
functions as are the Turing computable functions.

The equivalence between the λ-definable functions and the recursive
functions was one of the arguments used by Church to defend his thesis 
proposing the identification of the intuitive class of effectively computable 
functions with the class of recursive functions; in fact one can give 
arguments for the so called Church’s superthesis which states that for the 
functions involved this identification preserves the intensional character, 
i.e. process of computation. 

Historically the first undecidable problem was constructed by Church as
a problem about terms of the λ-calculus (whether they have a normal
form). The first definition, due to Church and Kleene, of the recursive
ordinals went via the λ-calculus. The fixed point theorem of the λ-calculus
inspired Kleene to the recursion theorem. Thus the λ-calculus played a
central role in the early investigations of the theory of recursive functions.

Church originally designed the λ-calculus as part of a general system of
functions intended to be a foundation of mathematics. The paradox of
Kleene and Rosser showed that this system was inconsistent. The present
theory was extracted as a consistent subtheory, CHURCH [1941]. After this,
Church seemed to have lost interest in using the λ-calculus to provide a
foundation for the whole of mathematics. CURRY et al. [1958, 1972], on the
other hand, have developed various systems of illative combinatory logic
intended as an ultimate foundation. These systems, however, have not
been developed enough to be a satisfactory basis for mathematics. See also
SCOTT [1975b] for work in this direction.

Due to the type free approach it was not clear how to construct models
of the theory. One would want a set X in which its function space X → X
can be embedded, contradicting Cantor's theorem. This difficulty was
overcome by Scott in his $D_\infty$ model constructions (in 1969), by restricting
X → X to the continuous functions on X (provided with a proper
topology). Also in the graph model $P\omega$, continuity plays an essential role.

Scott's models added a new dimension to the theory, namely limit
considerations. The author agrees with Scott's claim that this really makes
the theory a λ-calculus and what has been done before should be called
λ-algebra. See Section 7 where equalities in $D_\infty$ that cannot be proved
algebraically are established by approximation methods.

Typed versions of the theory, as well as their connections with category-
and proof theory are purposely not considered. The character of the typed
theories is totally different from the type free version, e.g. all typed terms
have a normal form. See TROELSTRA [1973] as reference for the typed
theory of (primitive) recursive functionals and MANN [1975] for the relation
with category- and proof theory.

It should be mentioned also that there is a theory related to the
λ-calculus, in which application is only partially defined. This is the theory
of uniformly reflexive structures of Wagner and Strong. This theory has an
obvious model in the partial recursive indices. In fact it is intended to be an
axiomatization of parts of recursion theory. See BARENDREGT [1975] for an
introduction and references.


Summary 

Section 1 gives an introduction to the theory and provides a general
model theoretic setting. Section 2 gives a treatment of the classical
λ-calculus. It will be proved that the recursive functions can be represented
as λ-terms and that many sets of λ-terms are undecidable. In Section 3 the
graph model $P\omega$ is constructed. Section 4 treats Scott's construction of the
models $D_\infty$ as a projective limit of complete lattices. In Section 5 a λ-theory
$\mathcal{H}$ is introduced. $\mathcal{H}$ has a unique maximal consistent extension $\mathcal{H}^*$. Section
6 associates to each term a tree, useful for the determination of its image in
the models $P\omega$, $D_\infty$. In Section 7 it is proved that $D_\infty \models M = N$ iff
$M = N \in \mathcal{H}^*$ iff M, N have equivalent trees.


1. Towards the theory 


The λ-calculus studies functions and their applicative behavior, and not,
as in category theory, just their behavior under composition. Therefore
application is the primitive operation of the λ-calculus. The function f
applied to the argument a is denoted by fa.

Schönfinkel observed that it is not necessary to introduce functions of
more variables. Indeed, for a function of say two variables f(x, y), one can
consider $g_x$ with $g_x(y) = f(x, y)$, and then f' with $f'x = g_x$; hence $(f'x)y =
f(x, y)$. Therefore a convenient notation is $hx_1 \cdots x_n = (\cdots(hx_1) \cdots x_n)$
(association to the left), the above example becoming f'xy = f(x, y). A
similar construction occurs in the s-m-n theorem in recursion theory.
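
Schönfinkel's observation can be tried out in any language with first-class functions. The following Python sketch is illustrative only; the helper names `curry2` and `apply_left` are hypothetical. It turns a two-argument f into the one-argument f' and applies a function to several arguments by association to the left.

```python
def curry2(f):
    """Given f(x, y), return f' with (f'x)y = f(x, y)."""
    return lambda x: lambda y: f(x, y)


def apply_left(h, *args):
    """Compute h x1 ... xn as (...((h x1) x2) ... xn)."""
    result = h
    for arg in args:
        result = result(arg)
    return result


def f(x, y):
    return 10 * x + y


f_prime = curry2(f)
g_3 = f_prime(3)  # the function g_x for x = 3, i.e. y |-> f(3, y)
```

With these definitions `apply_left(f_prime, 3, 4)`, `g_3(4)` and `f(3, 4)` all denote the same value.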


1.1. DEFINITION. An applicative system is a structure $\mathfrak{M} = (X, \cdot)$, where $\cdot$ is
a binary operation (application) on X.

The set of terms (using variables $a_0, a_1, \ldots$) over $\mathfrak{M}$, $T(\mathfrak{M})$, is inductively
defined as follows: $a_i \in T(\mathfrak{M})$; $a \in X \Rightarrow c_a \in T(\mathfrak{M})$; $A, B \in T(\mathfrak{M}) \Rightarrow
(AB) \in T(\mathfrak{M})$. $c_a$ is the constant corresponding to a. Juxtaposition of terms
denotes application.

$A_1 A_2 \cdots A_n$ denotes $(\cdots(A_1 A_2) \cdots A_n)$ (association to the left).


1.2. DEFINITION. A combinatory algebra is an applicative system $\mathfrak{M}$ such
that $\mathfrak{M}$ is not trivial (i.e. Card(X) > 1) and for each term A over $\mathfrak{M}$, with
variables among $y_1, \ldots, y_n$, we have in $\mathfrak{M}$:


1.3. $\exists f\ \forall y_1 \cdots y_n\ \ f y_1 \cdots y_n = A$   (combinatory completeness).


A combinatory algebra $\mathfrak{M}$ is extensional if in addition in $\mathfrak{M}$:
1.4. $\forall x\,(fx = f'x) \to f = f'$   (extensionality).


Combinatory completeness expresses that all algebraic functions are 
representable by an element. The motivation for this axiom is that for 
functions studied as rules, one certainly would like them to be closed under 
explicit definition. 

It is essential that in 1.3 A is purely algebraic and not defined using
logical operations. A diagonalization would otherwise make the system
trivial. However, combinatory completeness is already quite strong, e.g.
there are no finite combinatory algebras and in fact no recursive ones. In
contrast with e.g. the theory of fields there are $2^{\aleph_0}$ prime combinatory
algebras.

In 1.3 combinatory completeness is expressed by an existential axiom.
By an extension of the type of the language this can be expressed in a
universal way (cf. the elementary theory of groups where the axiom
$\forall x\, \exists y\ x \cdot y = e$ can be expressed by $x \cdot x^{-1} = e$ after extending the
language with $^{-1}$). In fact there are two ways to do this. The first one,
employed by Church, adds to the language an abstraction operator λ: if A
is a term, so is λx.A. Combinatory completeness now follows from


1.5. $(\lambda x. A)a = A[x/a]$   (β-conversion).


Multiple abstraction can be replaced by simple ones following
Schönfinkel's idea: let $\lambda x_1 \cdots x_n. A = \lambda x_1(\lambda x_2. \cdots (\lambda x_n. A) \cdots)$; then
$(\lambda x_1 \cdots x_n. A)a_1 \cdots a_n = A[x_1 \cdots x_n / a_1 \cdots a_n]$.

The other approach, due to Curry, results from realizing that combina- 
tory completeness follows from two of its instances. 


1.6. THEOREM. Let $\mathfrak{M} = (X, \cdot)$ be an applicative structure such that for some
$k, s \in X$ one has in $\mathfrak{M}$:
(i) $k \neq s$,
(ii) $kxy = x$,
(iii) $sxyz = xz(yz)$.
Then $\mathfrak{M}$ is a combinatory algebra.


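PROOF. First let i = skk; then ix = skkx = kx(kx) = x. By induction on the
complexity of a term A over $\mathfrak{M}$ one can define $\lambda^* x. A$ and show that in $\mathfrak{M}$
$(\lambda^* x. A)a = A[x/a]$. Let $I = c_i$, $K = c_k$ and $S = c_s$;


$\lambda^* x. x = I$;   $\lambda^* x. y = Ky$ if x, y are different variables;


$\lambda^* x. c_a = Kc_a$;   $\lambda^* x. A_1 A_2 = S(\lambda^* x. A_1)(\lambda^* x. A_2)$.


Therefore for the terms over $\mathfrak{M}$ one can define λ-abstraction satisfying
β-conversion, so combinatory completeness 1.3 holds; since $k \neq s$, $\mathfrak{M}$ is
not trivial, hence $\mathfrak{M}$ is a combinatory algebra. □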
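
The inductive definition of $\lambda^* x. A$ in the proof is easily mechanised. The Python sketch below is an illustration under stated assumptions, not part of the text: terms are encoded as nested tuples, and the names `abstract`, `substitute` and `normalize` are hypothetical. `normalize` rewrites with the three equations Ia = a, Kab = a and Sabc = ac(bc), so one can check $(\lambda^* x. A)a = A[x/a]$ on examples.

```python
def var(name):
    return ("var", name)


def app(a, b):
    return ("app", a, b)


S, K, I = ("const", "S"), ("const", "K"), ("const", "I")


def abstract(x, term):
    """Bracket abstraction: build lambda* x. term from S, K and I."""
    if term == var(x):
        return I
    if term[0] == "app":
        return app(app(S, abstract(x, term[1])), abstract(x, term[2]))
    return app(K, term)  # another variable or a constant


def substitute(term, x, value):
    """Compute term[x/value] (there are no binders, so no capture)."""
    if term == var(x):
        return value
    if term[0] == "app":
        return app(substitute(term[1], x, value), substitute(term[2], x, value))
    return term


def normalize(term):
    """Rewrite with Ia = a, Kab = a, Sabc = ac(bc) until no rule applies."""
    head, args = term, []  # args is a stack: the first argument is on top
    while True:
        while head[0] == "app":
            args.append(head[2])
            head = head[1]
        if head == I and len(args) >= 1:
            head = args.pop()
        elif head == K and len(args) >= 2:
            head = args.pop()
            args.pop()
        elif head == S and len(args) >= 3:
            a, b, c = args.pop(), args.pop(), args.pop()
            head = app(app(a, c), app(b, c))
        else:
            break
    result = head
    for arg in reversed(args):
        result = app(result, normalize(arg))
    return result


# Example: A = xyx, so lambda* x. A = S(S I (K y)) I.
A = app(app(var("x"), var("y")), var("x"))
a = ("const", "a")
```

For instance `normalize(app(abstract("x", A), a))` yields the same term as `substitute(A, "x", a)`. The translation is the naive one from the proof; it makes terms grow quickly, and no optimisation is attempted.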


Curry's theory is elegant because of its simplicity. In fact the theory of
combinators with constants k and s satisfying (i)-(iii) of 1.6 is the simplest
theory which is essentially undecidable. Church's notation is more intuitive
however and will be used in this chapter.
The (formal) λ-calculus is essentially the theory which has application
and abstraction as primitives and β-conversion as axiom. In addition it has
the notion of reduction which formalizes the fact that e.g. an expression
like $(\lambda x.\, x^2 + 1)3$ can be computed to yield 10, but not conversely. Due to
the Church-Rosser theorem, reduction is very useful for the proof theory
of the λ-calculus.

It should be stressed that in a theory about functions as rules, terms play
a central role. This view differs from that of Scott, who puts the models
central. It is true indeed that models are of interest not only for the insight
they give on the equality of terms, but also for their mathematical
structure. But the theory of $D_\infty$ is especially beautiful because of the limit
characterization of equality of terms; see Section 7.

Typical questions asked about terms are: 

(i) What kind of functions on terms are representable? 

(ii) Which terms are equal, which ones essentially different? Which 
terms can/should be equated? 

Restricted to numerals, the classical answer to (i) is: the recursive
functions. Question (ii) can be approached by either giving consistency
proofs for reasonable extensions of the λ-calculus or by constructing
models and considering the set of equations true in them. $\mathcal{H}$ and $\mathcal{H}^*$ of
Section 5 were found by the first and second method respectively. Also the
questions under (ii) explain why extensionality is sometimes added to the
λ-calculus. Cf. the theorem of Böhm in 2.23, which states that the
extensional theory is complete with respect to terms having a nf.


The theory

1.7. DEFINITION. The λ-calculus has the following language.


Alphabet:

    $a_0, a_1, \ldots$   variables;

    $\to$, $=$   reduction, equality;

    $\lambda$, ), (   auxiliary symbols.

Terms are inductively defined:

    (i) Any variable is a term;

    (ii) if M, N are terms, so is (MN);

    (iii) if M is a term and x a variable, then $(\lambda x M)$ is a term.

Formulas:

    If M, N are terms, then $M \to N$ and $M = N$ are formulas.


1.8. CONVENTIONS. A term of the λ-calculus is called a λ-term. M, N, ... is
a syntactic notation for arbitrary λ-terms. x, y, z, ... is a syntactic notation
for arbitrary variables. $M_1 M_2 \cdots M_n$ stands for $(\cdots(M_1 M_2) \cdots M_n)$
(association to the left). $\lambda x_1 \cdots x_n. M$ stands for $(\lambda x_1(\lambda x_2 \cdots (\lambda x_n M) \cdots))$.
The symbol $\equiv$ denotes syntactic equality.

A variable x occurs free in a term M if x is not in the scope of a $\lambda x$,
otherwise x occurs bound. In this respect $\lambda x$ has the same binding
properties as $\forall x$ in predicate logic or $\int \cdots dx$ in calculus. We identify
terms differing only in the names of their bound variables, e.g. $\lambda x.x \equiv
\lambda y.y$. FV(M) is the set of free variables in M; M is closed if $FV(M) = \emptyset$.

M[x/N] denotes the result of substituting N for the free occurrences of
x in M. In order to prevent confusion of variables we have to assume, as is
the case in predicate logic, that no free variable of N becomes bound in
M[x/N]. This can be accomplished by renaming some of the bound
variables in M, e.g. $(\lambda a. ax)[x/a] \equiv \lambda a'. a'a \not\equiv \lambda a. aa$. After this
precaution, the definition of substitution is independent of the choice of
representative in the equivalence class of identified terms. See also DE BRUIJN
[1972].
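
The renaming precaution can be made concrete. The Python sketch below is one possible encoding and not the author's: terms are nested tuples, and the names `free_vars`, `fresh`, `subst` and `de_bruijn` are hypothetical. It implements M[x/N] with renaming of bound variables, and compares terms up to the names of bound variables by translating them to a nameless form in the spirit of DE BRUIJN [1972].

```python
def var(name):
    return ("var", name)


def app(m, n):
    return ("app", m, n)


def lam(x, m):
    return ("lam", x, m)


def free_vars(t):
    """FV(t)."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "app":
        return free_vars(t[1]) | free_vars(t[2])
    return free_vars(t[2]) - {t[1]}


def fresh(name, avoid):
    """Return a variant of name (primes appended) that is not in avoid."""
    while name in avoid:
        name += "'"
    return name


def subst(t, x, n):
    """Compute t[x/n]; no free variable of n becomes bound."""
    if t[0] == "var":
        return n if t[1] == x else t
    if t[0] == "app":
        return app(subst(t[1], x, n), subst(t[2], x, n))
    y, body = t[1], t[2]
    if y == x:  # x is bound here, so it has no free occurrence below
        return t
    if y in free_vars(n) and x in free_vars(body):
        z = fresh(y, free_vars(n) | free_vars(body) | {x})
        body = subst(body, y, var(z))  # rename the bound variable first
        y = z
    return lam(y, subst(body, x, n))


def de_bruijn(t, bound=()):
    """Nameless form: equal results mean equal up to bound-variable names."""
    if t[0] == "var":
        if t[1] in bound:
            return ("bound", bound.index(t[1]))
        return ("free", t[1])
    if t[0] == "app":
        return ("app", de_bruijn(t[1], bound), de_bruijn(t[2], bound))
    return ("lam", de_bruijn(t[2], (t[1],) + bound))


# The example of the text: (lambda a. a x)[x/a].
example = subst(lam("a", app(var("a"), var("x"))), "x", var("a"))
```

Here `example` is the term λa'.a'a: its nameless form agrees with that of λb.ba and differs from that of λa.aa.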


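1.9. DEFINITION. The λ-calculus is defined by the following axiom schemes
and rules.

I.  1. $(\lambda x. M)N \to M[x/N]$   (β-reduction),
    2. $M \to M$,
    3. $M \to N,\ N \to L \Rightarrow M \to L$,
    4. (a) $M \to M' \Rightarrow ZM \to ZM'$,
       (b) $M \to M' \Rightarrow MZ \to M'Z$,
       (c) $M \to M' \Rightarrow \lambda x. M \to \lambda x. M'$.

II. 1. $M \to M' \Rightarrow M = M'$,
    2. $M = M' \Rightarrow M' = M$,
    3. $M = N,\ N = L \Rightarrow M = L$,
    4. (a) $M = M' \Rightarrow ZM = ZM'$,
       (b) $M = M' \Rightarrow MZ = M'Z$,
       (c) $M = M' \Rightarrow \lambda x. M = \lambda x. M'$.

If $M = N$ or $M \to N$ is derivable one writes $\lambda \vdash M = N$ or $(\lambda \vdash) M \to N$
respectively. Since = is generated by →, the rules II.4 follow from I.4. The
addition of II.4 is necessary however, if one considers extensions of the
λ-calculus.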


1.10. Extensionality
The λ-calculus can be extended by the following rule of extensionality,


ext: $Mx = M'x \Rightarrow M = M'$, provided $x \notin FV(M), FV(M')$.
The rule ext can be axiomatized by adding the rule


$\lambda x. Mx \to M$   (η-reduction)


provided $x \notin FV(M)$:

If $Mx = M'x$, and $x \notin FV(MM')$, then $\lambda x. Mx = \lambda x. M'x$, hence $M =
\lambda x. Mx = \lambda x. M'x = M'$.

The λ-calculus with this additional reduction rule is called the
λη-calculus. Provability in this theory is denoted by $\lambda\eta \vdash \cdots$.

M and N are β(η)-convertible iff $\lambda(\eta) \vdash M = N$.


The models 

Our main object of study is the λ-calculus. Therefore we would want
that the combinatory algebras are also models for this theory, i.e. that there
is an interpretation of the λ-operator. However unless a combinatory
algebra is extensional, there is a choice for the element f representing the
function A in 1.3. Thus the combinatory algebras have not enough
structure to be models for the λ-calculus.


1.11. DEFINITION. A pre-λ-algebra is a combinatory algebra $\mathfrak{M}$ together
with a method of assigning to each term $A \in T(\mathfrak{M})$ a term $\lambda^* x. A \in T(\mathfrak{M})$
such that

(i) x does not occur in $\lambda^* x. A$,

(ii) $\mathfrak{M} \models (\lambda^* x. A)x = A$.


Remarks. (1) For most $\mathfrak{M}$ this assignment $A \mapsto \lambda^* x. A$ is provided for by
the proof that $\mathfrak{M}$ is a combinatory algebra.

(2) Although the definition of a pre-λ-algebra is not formulated in a
conventional first order way, from a constructive point of view it is
completely clear.


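1.12. DEFINITION. Interpretation of λ-terms. Let ρ be a valuation of the
variables into a pre-λ-algebra $\mathfrak{M} = (X, \cdot)$, i.e. $\rho : \{a_0, a_1, \ldots\} \to X$. The
value in $\mathfrak{M}$ of a λ-term M under the valuation ρ, notation $[\![M]\!]^{\mathfrak{M}}_\rho$, is defined
in two steps. First M is transformed into a term $(\!(M)\!)^{\mathfrak{M}} \in T(\mathfrak{M})$. Then this
term is interpreted in $\mathfrak{M}$ under the valuation ρ in the usual first order way,
yielding $[\![M]\!]^{\mathfrak{M}}_\rho$.

$(\!(M)\!)^{\mathfrak{M}}$ is inductively defined as follows: $(\!(x)\!)^{\mathfrak{M}} = x$; $(\!(MN)\!)^{\mathfrak{M}} =
(\!(M)\!)^{\mathfrak{M}} (\!(N)\!)^{\mathfrak{M}}$; $(\!(\lambda x. M)\!)^{\mathfrak{M}} = \lambda^* x. (\!(M)\!)^{\mathfrak{M}}$.


1.13. DEFINITION. Satisfaction. As usual, $\mathfrak{M} \models M = N$ iff for all ρ,
$[\![M]\!]^{\mathfrak{M}}_\rho = [\![N]\!]^{\mathfrak{M}}_\rho$.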


1.14. DEFINITION. A λ-algebra or model of the λ-calculus is a
pre-λ-algebra $\mathfrak{M}$ such that $\lambda \vdash M = N \Rightarrow \mathfrak{M} \models M = N$.


Remark. The term model $\mathfrak{M}(CL)$ of the theory of combinators is a
pre-λ-algebra (by the proof of 1.6) but not a λ-algebra, since
$\mathfrak{M}(CL) \not\models \lambda x. ((\lambda y. y)x) = \lambda x. x$:  $s(ki)i \neq i$.


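1.15. DEFINITION. (i) A weakly extensional (w.e.) λ-algebra is a λ-algebra
$\mathfrak{M}$ such that


$\mathfrak{M} \models M = M' \Rightarrow \mathfrak{M} \models \lambda x. M = \lambda x. M'$.


(ii) A λ-algebra $\mathfrak{M}$ is extensional iff $\mathfrak{M}$ satisfies $\forall x\,(fx = gx) \to f = g$
(extensionality).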


1.16. REMARK. (1) An extensional λ-algebra is clearly w.e.

(2) A combinatory algebra satisfying extensionality is an extensional
λ-algebra, since there is only one way to define abstraction.

(3) There are interesting λ-algebras that are not weakly extensional, e.g.
$\mathfrak{M}^\circ(\lambda)$ (see below) as follows by the ω-incompleteness of the λ-calculus (cf.
PLOTKIN [1974]) or Pt.

(4) The only A-algebras that are considered here are either w.e. or term 
models. 


1.17. General concepts and notations 


ω denotes the set of natural numbers. $\boldsymbol{\lambda} x. \cdots$ denotes the mapping
$x \mapsto \cdots$ (meta lambda).


Notions connected with theories 

Λ is the set of λ-terms. Λ° is the set of closed λ-terms. Let T be a set of
equations between λ-terms. Then λ + T is the λ-calculus extended with
the equations in T as axioms. $T^+ = \{M = N \mid M, N \in \Lambda^\circ$ and $\lambda + T \vdash M =
N\}$. T is said to be consistent if $T^+$ does not contain every equation. A
λ-theory is a consistent set of equations T such that $T = T^+$.

λ is the λ-theory $\{M = N \mid M, N \in \Lambda^\circ$ and $\lambda \vdash M = N\}$; the consistency
is shown in 2.9.

For a λ-theory T define $M \sim_T N$ iff $\lambda + T \vdash M = N$. $\sim_T$ is an equivalence
relation and let $[M]_T$ denote the equivalence class of M with respect to
$\sim_T$. The term model of T, $\mathfrak{M}(T)$, is the λ-algebra consisting of all λ-terms
modulo $\sim_T$, with application and abstraction defined canonically. The
closed term model of T, $\mathfrak{M}^\circ(T)$, is the set of terms without free variables
modulo $\sim_T$.

A λ-theory T is maximally consistent if T has no proper consistent
extensions.


Notions connected with models 

For a λ-algebra $\mathfrak{M}$, Th($\mathfrak{M}$) is the λ-theory $\{M = N \mid M, N \in \Lambda^\circ$ and
$\mathfrak{M} \models M = N\}$. The consistency follows from the fact that Card($\mathfrak{M}$) > 1 (see
2.3).

The interior of $\mathfrak{M}$, notation $\mathfrak{M}^\circ$, is the substructure consisting of the
images in $\mathfrak{M}$ of the closed λ-terms. Up to isomorphism $\mathfrak{M}^\circ = \mathfrak{M}^\circ(\mathrm{Th}(\mathfrak{M}))$.
$\mathfrak{M}$ is hard iff $\mathfrak{M} = \mathfrak{M}^\circ$. The hard λ-algebras are the prime structures among
the λ-algebras.

For λ-algebras a homomorphism $h : \mathfrak{M} \to \mathfrak{M}'$ should not only preserve
application, but also abstraction, i.e. for a term $A \in T(\mathfrak{M})$, $h(\lambda^* x. A) =
\lambda^* x. hA$ in $\mathfrak{M}'$ where for $B \in T(\mathfrak{M})$, $hB \in T(\mathfrak{M}')$ is the term obtained by
replacing in B all constants $c_a$ by $c_{ha}$.

We will use homomorphisms only in connection with term models.
There the description is simple. If $S \subseteq T$ are λ-theories, then a
homomorphism $h : \mathfrak{M}(S) \to \mathfrak{M}(T)$ is defined by $h([M]_S) = [M]_T$. Thus each (closed) term
model $\mathfrak{M}^{(\circ)}(S)$ is the homomorphic image of $\mathfrak{M}^{(\circ)}(\lambda)$: $[M]_\lambda \mapsto [M]_S$. If S is
maximally consistent, then $\mathfrak{M}^\circ(S)$ is algebraically simple, i.e. has no proper
homomorphic images.


2. Classical A-calculus 


The classical theory is mainly concerned with $\mathfrak{M}(\lambda)$. Among others the
following theorems will be proved. All recursive functions are λ-definable
(Kleene). The set of terms with(out) a normal form is undecidable
(Church). There is no recursive model for the λ-calculus (Grzegorczyk).
The last two theorems follow most easily from a theorem of Scott. Finally
the theorem of Böhm is stated, which shows the completeness of
βη-conversion for terms having a normal form.


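2.1. FIXED POINT THEOREM. For every $F \in \Lambda$ there is an $M \in \Lambda$ such that
$\lambda \vdash FM = M$.


PROOF. Define $\omega \equiv \lambda x. F(xx)$ and $M \equiv \omega\omega$. Then $\lambda \vdash M \equiv \omega\omega \equiv
(\lambda x. F(xx))\omega = F(\omega\omega) \equiv FM$. □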


Remarks. (1) The fixed points can be found in a uniform way: let
$Y \equiv \lambda f. (\lambda x. f(xx))(\lambda x. f(xx))$; then $\lambda \vdash Yf = f(Yf)$.

(2) Since the theorem holds for terms possibly containing free variables,
each element of a λ-algebra has a fixed point.

(3) Curry calls Y the paradoxical combinator.
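
Remark (1) can be tried out in a programming language, with one caveat. The sketch below is in Python and is not part of the text. Python evaluates arguments before applying a function, so a literal transcription of Y would loop forever. The sketch therefore uses the η-expanded variant λf.(λx.f(λv.xxv))(λx.f(λv.xxv)), here named `Z`; this substitution, the use of Python integers instead of numerals, and the names `step` and `factorial` are assumptions made for the illustration.

```python
def Z(f):
    """Fixed point operator for an eagerly evaluated language.

    Z(f) behaves like f(Z(f)): the inner lambda v delays the
    self-application x(x) until an argument is actually supplied.
    """
    return (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))


def step(self_):
    """A functional F whose fixed point is the factorial function."""
    return lambda n: 1 if n == 0 else n * self_(n - 1)


factorial = Z(step)
```

Here `factorial` is a fixed point of `step` in the extensional sense: `step(factorial)(n)` and `factorial(n)` agree for every natural number n.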


Note that in 2.1, $\lambda \vdash M \to FM$. This explains why the related
constructions in the recursion theorem or Gödel's self-referential sentence are
somewhat puzzling, cf. BARENDREGT [to appear], 36.7.


2.2. Frequently we need some standard terms. Let $I \equiv \lambda x. x$, $K \equiv \lambda xy. x$,
$S \equiv \lambda xyz. xz(yz)$ and $\Omega \equiv (\lambda x. xx)(\lambda x. xx)$. Then $\lambda \vdash IM = M$,
$\lambda \vdash KMN = M$ and $\lambda \vdash SMNL = ML(NL)$.

From 1.6 it follows that each closed term can be defined in terms of I, K
and S.


2.3. Truth values t (true) and f (false). Define $t \equiv K$, $f \equiv KI$. Then
$\lambda \vdash tMN = M$ and $\lambda \vdash fMN = N$. Note that $t \neq f$ in any λ-algebra $\mathfrak{M}$, for
otherwise $\mathfrak{M}$ would satisfy x = txy = fxy = y and hence be trivial.


2.4. Conditional

If B is a term taking values t and f, then the intuitive value of "If B then
M else N" can be represented by BMN.


2.5. Ordered pairs

Define $[M, N] \equiv \lambda x. xMN$, $(M)_0 \equiv Mt$ and $(M)_1 \equiv Mf$. Then
$\lambda \vdash ([M_0, M_1])_i = M_i$, i = 0, 1.


2.6. Numerals

Define $\underline{0} \equiv I$ and $\underline{n+1} \equiv [\underline{n}, K]$. The numerals are chosen this way to
provide a convenient base for the representation of the recursive functions.
CHURCH [1941] used the following numerals: $\underline{n} \equiv \lambda fx. f^n x$, where $f^0 x \equiv x$
and $f^{n+1} x \equiv f(f^n x)$.


2.7. DEFINITION. (i) A redex is a term of the form $(\lambda x. P)Q$ (in the
extensional theory also $(\lambda x. Px)$, with $x \notin FV(P)$, is a redex).

(ii) A term M is in normal form (nf) iff there is no subterm of M which is
a redex (if it is necessary to distinguish nf's in the λ- and λη-calculus one
talks about β- and βη-nf's).

(iii) A term M has a nf iff for some N in nf $\lambda \vdash M \to N$ (in the
extensional theory $\lambda\eta \vdash M \to N$).

Intuitively a term is in nf if it cannot be computed any further.


Example. $(\lambda x. xx)y$ has the nf yy; Ω has no nf. Note that for every
natural number n, $\underline{n}$ is in nf.


Now an important theorem on reduction will be stated. For details see 
e.g. HINDLEY et al. [1972] p. 139 or BARENDREGT [to appear]. 


2.8. CHURCH-ROSSER THEOREM. If $\lambda \vdash M = N$, then for some Z, $\lambda \vdash M \to Z$
and $\lambda \vdash N \to Z$ (and similarly for the λη-calculus).


PROOF (outline; after Tait and Martin-Löf). The theorem follows from
(and in fact is equivalent with):


(*) If $M \to N_1$, $M \to N_2$, then for some Z, $N_1 \to Z$ and $N_2 \to Z$
(the diamond property for →).


[Diagram: M reduces to $N_1$ and to $N_2$; both $N_1$ and $N_2$ reduce to a common Z.]


That 2.8 follows from (*) is proved by induction on the length of proof of
$\lambda \vdash M = N$, (*) being needed for the transitivity of =.

(*) is proved by defining a relation $\to_1$ on λ-terms, such that (i) $\to_1$ has the
diamond property; (ii) → is the transitive closure of $\to_1$. The diamond
property for → then follows by a simple diagram chasing.

The relation $\to_1$ is defined inductively as follows:
$M \to_1 M$;   $M \to_1 M',\ N \to_1 N' \Rightarrow (\lambda x. M)N \to_1 M'[x/N']$;
$M \to_1 M',\ N \to_1 N' \Rightarrow MN \to_1 M'N'$;   $M \to_1 M' \Rightarrow \lambda x. M \to_1 \lambda x. M'$.


Then (ii) is obvious and (i) follows from a case analysis and $M \to_1 M'$,
$N \to_1 N' \Rightarrow M[x/N] \to_1 M'[x/N']$, as can be proved by induction on the
generation of $\to_1$. □


2.9. COROLLARY. (i) If M has a nf at all, then M has a unique nf.
(ii) If $\lambda \vdash M = N$ and N is in nf, then $M \to N$.
(iii) The λ-calculus is consistent, i.e. $\lambda \nvdash M = N$ for some equation.


PROOF. (i) Note that if N is a nf and $N \to N'$, then $N' \equiv N$. Now suppose
$M \to N_1$, $M \to N_2$ and $N_1$, $N_2$ are nf, then $\lambda \vdash N_1 = N_2$, so by 2.8, $N_1 \to Z$
and $N_2 \to Z$. But then $N_1 \equiv Z \equiv N_2$.

(ii) If M = N, again $M \to Z$ and $N \to Z$. But since N is a nf, $N \equiv Z$, so
$M \to N$.

(iii) If M, N are distinct nf's, then $\lambda \nvdash M = N$ by (ii) and (i). □


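2.10. DEFINITION. Let ω be the set of natural numbers. A function
$f : \omega^n \to \omega$ is λ-definable iff for some $F \in \Lambda^\circ$,

(*) $\lambda \vdash F\underline{k_1} \cdots \underline{k_n} = \underline{m} \iff f(k_1, \ldots, k_n) = m$.

If (*) holds, then f is said to be λ-definable by F.

2.11. REMARK. If for some λ-term F instead of (*) we have

(**) $f(k_1, \ldots, k_n) = m \Rightarrow \lambda \vdash F\underline{k_1} \cdots \underline{k_n} = \underline{m}$,


then f is λ-definable by F: suppose $\lambda \vdash F\underline{k_1} \cdots \underline{k_n} = \underline{m}$ and $f(k_1, \ldots, k_n) =
m'$. Then by (**), $\lambda \vdash F\underline{k_1} \cdots \underline{k_n} = \underline{m'}$ and hence by 2.9(i) m = m', since the
numerals are in normal form. So $f(k_1, \ldots, k_n) = m$.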


2.12. LEMMA. The initial functions $U_i^n(x_1, \ldots, x_n) = x_i$, $Z(x) = 0$, $S^+(x) =
x + 1$ are λ-definable.


PROOF. Take as defining terms $U_i^n \equiv \lambda x_1 \cdots x_n. x_i$, $Z \equiv \lambda x. \underline{0}$ and $S^+ \equiv
\lambda x. [x, K]$ respectively and use 2.11. □


2.13. LEMMA. The λ-definable functions are closed under composition.


PROOF. The representation of the composition is the composition of the
representations. □
2.14. LEMMA. There are terms P and Zero, such that $P(S^+ x) = x$ and
Zero x = t if $x = \underline{0}$, Zero x = f if x is a numeral $\neq \underline{0}$.


PROOF. Take $P \equiv \lambda x. (x)_0$, Zero $\equiv \lambda x. (x)_1 f t$. □
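
The encodings of 2.3, 2.5, 2.6 and 2.14 can be checked mechanically. In the Python sketch below λ-terms are modelled by Python closures, which is an assumption of the illustration and not the author's construction; the decoding helpers `numeral`, `to_bool` and `to_int` are hypothetical names and lie outside the λ-calculus.

```python
I = lambda x: x
K = lambda x: lambda y: x

t = K      # true:  t M N = M
f = K(I)   # false: f M N = N

pair = lambda m: lambda n: lambda x: x(m)(n)  # [M, N] = \x. x M N
first = lambda m: m(t)                        # (M)_0 = M t
second = lambda m: m(f)                       # (M)_1 = M f

zero = I                                      # numeral 0 = I
succ = lambda x: pair(x)(K)                   # S+ = \x. [x, K]
pred = lambda x: first(x)                     # P = \x. (x)_0
is_zero = lambda x: second(x)(f)(t)           # Zero = \x. (x)_1 f t


def numeral(n):
    """Build the numeral for the Python integer n."""
    term = zero
    for _ in range(n):
        term = succ(term)
    return term


def to_bool(b):
    """Decode a truth value t or f into a Python bool."""
    return b(True)(False)


def to_int(term):
    """Decode a numeral by counting predecessor steps down to zero."""
    count = 0
    while not to_bool(is_zero(term)):
        term = pred(term)
        count += 1
    return count
```

Note that `to_int` terminates only on values built by `numeral`, since it relies on `is_zero` eventually answering t.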


2.15. Lemma. The A-definable functions are closed under primitive recur- 
sion. 


PROOF. For simplicity we omit parameters. Let f be defined by f(0) = k,
f(n+1) = g(f(n), n), where g is λ-definable by G. We want to define F
such that it satisfies Fx = if Zero x, then k, else G(F(Px))(Px). By 2.4 this
can be expressed as Fx = Zero x k [G(F(Px))(Px)] or
F = λx. Zero x k [G(F(Px))(Px)]. Define Θ ≡ λfx. Zero x k [G(f(Px))(Px)].
Then we can take F as being the fixed point of Θ. □
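
The fixed point construction can be imitated in Python, with two adjustments that are assumptions of this sketch rather than part of the proof: since Python evaluates arguments eagerly, a call-by-value fixed point combinator (`FIX`) replaces Y, and both branches of the conditional are wrapped in dummy abstractions so that only the selected one is evaluated. The numeral encoding (0 ≡ I, n+1 ≡ [n, K]) is assumed as well, and the example takes k = 0 and g(a, n) = a + 2, so that f(n) = 2n.

```python
T = lambda x: lambda y: x                 # t = K
F = lambda x: lambda y: y                 # f
ZERO = lambda x: x                        # numeral 0 (assumed to be I)
SUCC = lambda x: lambda z: z(x)(T)        # S+ = λx.[x, K]
PRED = lambda x: x(T)                     # P = λx.(x)_0
IS_ZERO = lambda x: x(F)(F)(T)            # Zero = λx.(x)_1 f t

# Call-by-value fixed point combinator: FIX(h) behaves like h(FIX(h)).
FIX = lambda h: (lambda x: h(lambda v: x(x)(v)))(lambda x: h(lambda v: x(x)(v)))

K_TERM = ZERO                             # k = 0
G = lambda a: lambda n: SUCC(SUCC(a))     # g(a, n) = a + 2

# Theta = λf x. Zero x k [G(f(Px))(Px)], with each branch delayed.
THETA = lambda f: lambda x: IS_ZERO(x)(lambda _: K_TERM)(
    lambda _: G(f(PRED(x)))(PRED(x)))(None)

DOUBLE = FIX(THETA)                       # F, the fixed point of Theta


def numeral(n):
    term = ZERO
    for _ in range(n):
        term = SUCC(term)
    return term


def to_int(term):
    count = 0
    while IS_ZERO(term) is F:
        term = PRED(term)
        count += 1
    return count
```

With these definitions `to_int(DOUBLE(numeral(4)))` evaluates to 8.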


2.16. LEMMA. The A-definable functions are closed under minimalization. 


PROOF. (Again we omit parameters.) Let f be defined by
f(x) = μy[g(x, y) = 0], where g is λ-defined by G and ∀n ∃m g(n, m) = 0. As in
2.15 we can find a λ-term H such that Hxy = if Zero(Gxy), then y, else
Hx(S⁺y). Then take F ≡ λx. Hx0. □


REMARK. It is clear that the λ-calculus is a recursively axiomatizable
theory. Hence (after Gödelization) the relation {(M, N) | λ ⊢ M = N} is
recursively enumerable.


2.17. THEOREM (Kleene). The A-definable functions are exactly the recur- 
sive functions. 


PROOF. If f is recursive, then f is λ-definable, since the recursive functions
are the least class containing the initial functions which is closed under
composition, primitive recursion and minimalization. If f is λ-defined by F,
then f(k₁, ..., kₙ) = m ⇔ λ ⊢ F k₁ ⋯ kₙ = m; hence by the preceding
remark the graph of f is r.e., so f is recursive. □


REMARK. A partial function ψ: ωᵏ → ω is λ-definable iff for some term F,
ψ(k⃗) = m ⇔ λ ⊢ F k⃗ = m, and ψ(k⃗) undefined ⇔ F k⃗ has no nf.

It can be shown that for partial functions, ψ is λ-definable iff ψ is partial
recursive. In order to do this one must show that certain terms have no nf.
For this purpose the following reduction strategy is useful.

2.18. DEFINITION. Let M be a λ-term. If M is not a nf, let M' be obtained
by reducing the leftmost redex in M (i.e. the redex with its λ as far to the
left as possible), else M' is undefined. Define M₀ ≡ M, M_{n+1} ≡ (Mₙ)'. The
sequence M₀ → M₁ → ⋯ (finite or infinite) is called the leftmost or normal
reduction chain of M.


2.19. NORMALIZATION THEOREM (Curry). M has a nf iff the leftmost
reduction chain of M is finite.


PROOF. See Curry et al. [1958] p.142. □


The theorem is false for arbitrary reduction chains: consider e.g. KIΩ,
which has a nf but also an infinite reduction chain.
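
The contrast between the leftmost chain and an arbitrary chain can be checked mechanically. The sketch below is an illustration under assumptions of its own: terms are written with de Bruijn indices (tuples `('var', i)`, `('lam', body)`, `('app', f, a)`), and the competing strategy `argument_first_step`, which always looks for a redex in the argument first, is a hypothetical choice made only to exhibit an infinite chain for KIΩ.

```python
def shift(t, d, cutoff=0):
    """Add d to every free index (>= cutoff) of t."""
    if t[0] == 'var':
        return ('var', t[1] + d) if t[1] >= cutoff else t
    if t[0] == 'lam':
        return ('lam', shift(t[1], d, cutoff + 1))
    return ('app', shift(t[1], d, cutoff), shift(t[2], d, cutoff))


def subst(t, j, s):
    """Replace index j in t by s."""
    if t[0] == 'var':
        return s if t[1] == j else t
    if t[0] == 'lam':
        return ('lam', subst(t[1], j + 1, shift(s, 1)))
    return ('app', subst(t[1], j, s), subst(t[2], j, s))


def beta(body, arg):
    """Contract the redex (λ.body) arg."""
    return shift(subst(body, 0, shift(arg, 1)), -1)


def leftmost_step(t):
    """One step of the leftmost chain; None if t is a normal form."""
    if t[0] == 'var':
        return None
    if t[0] == 'lam':
        r = leftmost_step(t[1])
        return None if r is None else ('lam', r)
    f, a = t[1], t[2]
    if f[0] == 'lam':
        return beta(f[1], a)
    r = leftmost_step(f)
    if r is not None:
        return ('app', r, a)
    r = leftmost_step(a)
    return None if r is None else ('app', f, r)


def argument_first_step(t):
    """A different strategy: reduce inside the argument whenever possible."""
    if t[0] == 'var':
        return None
    if t[0] == 'lam':
        r = argument_first_step(t[1])
        return None if r is None else ('lam', r)
    f, a = t[1], t[2]
    r = argument_first_step(a)
    if r is not None:
        return ('app', f, r)
    r = argument_first_step(f)
    if r is not None:
        return ('app', r, a)
    return beta(f[1], a) if f[0] == 'lam' else None


def reduce_with(term, step, limit):
    """Follow a strategy for at most `limit` steps; report whether a nf was reached."""
    for _ in range(limit):
        nxt = step(term)
        if nxt is None:
            return term, True
        term = nxt
    return term, False


I = ('lam', ('var', 0))
K = ('lam', ('lam', ('var', 1)))
W = ('lam', ('app', ('var', 0), ('var', 0)))
OMEGA = ('app', W, W)
KIO = ('app', ('app', K, I), OMEGA)
```

Here `reduce_with(KIO, leftmost_step, 100)` stops at I, while `reduce_with(KIO, argument_first_step, 100)` exhausts its step budget because Ω keeps reducing to itself.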


Undecidability results 

After coding syntactical objects as natural numbers one can speak of the
decidability of a set of terms or equations. It will be shown that the
λ-calculus is essentially undecidable, i.e. has no decidable consistent
extension.

It is standard to define a coding M ↦ #M such that there are recursive
functions Ap(#M, #N) = #(MN) and Num(n) = #n (the code of the numeral n).
The numeral #M will be denoted by ⌜M⌝.


2.20. SECOND FIXED POINT THEOREM. For each F ∈ Λ° there is an X ∈ Λ°
such that λ ⊢ F⌜X⌝ = X.


PROOF. Let the recursive functions Ap and Num be λ-defined by the terms
Ap and Num. Define ω ≡ λx. F(Ap x (Num x)) and X ≡ ω⌜ω⌝. Then

λ ⊢ X = ω⌜ω⌝ → F(Ap⌜ω⌝(Num⌜ω⌝)) → F⌜ω⌜ω⌝⌝ ≡ F⌜X⌝,

since Num⌜ω⌝ → ⌜⌜ω⌝⌝. □


A set 𝒜 ⊆ Λ° is closed under equality iff M ∈ 𝒜 and M = N ∈ λ ⇒
N ∈ 𝒜. 𝒜 ⊆ Λ° is non-trivial iff 𝒜 ≠ ∅ and 𝒜 ≠ Λ°.


2.21. THEOREM (Scott). Let 𝒜 ⊆ Λ° be a non-trivial set closed under
equality. Then 𝒜 is not recursive.

PROOF. Let M₀ ∈ 𝒜, M₁ ∈ Λ° − 𝒜. Suppose 𝒜 were recursive. Then there
is a recursive function f: ω → {0, 1} such that f(#M) = 0 iff M ∈ 𝒜. Let f
be λ-defined by F. Define F' ≡ λx. If Zero(Fx), then M₁, else M₀. Then
F'⌜M⌝ = M₁ if M ∈ 𝒜 and F'⌜M⌝ = M₀ if M ∉ 𝒜. Now ∃X ∈ Λ° F'⌜X⌝ = X,
by 2.20. If X ∈ 𝒜, then X = F'⌜X⌝ = M₁ ∉ 𝒜, and if X ∉ 𝒜, then
X = F'⌜X⌝ = M₀ ∈ 𝒜, contradicting in both cases that 𝒜 is closed under
equality. □


2.22. COROLLARY (Church; Grzegorczyk). (i) The set of terms with(out) a
nf is not recursive.
(ii) There are no recursive λ-theories.
(iii) There are no recursive λ-algebras.


PROOF. (i) {M | M has a (has no) nf} satisfies the condition of 2.21.
(ii) Let T be a λ-theory. Then {M | M = I ∈ T} satisfies 2.21. Hence T is
not recursive.
(iii) If 𝔐 were a recursive λ-algebra, then Th(𝔐) would be a recursive
λ-theory, contradicting (ii). □


The following result shows the completeness of the λη-calculus with
respect to terms having a nf.


2.23. THEOREM (Böhm). If M, N are different βη-nf's, then λ + M = N is
inconsistent.


PROOF (outline). If M ≢ N are βη-nf's, then M, N have finite Böhm trees
which are not η-equivalent, see Section 6. Hence by 6.8, M = N ∉ ℋ*. In
fact it follows from the proof of 6.8 that in this case λ ⊢ C[M] = x and
λ ⊢ C[N] = y for some context C[·] and variables x ≢ y. So λ + M = N
⊢ x = y, i.e. λ + M = N is inconsistent. □


For terms without nf this completeness result is false, see 7.2.


3. Construction of the graph model Pω


The first λ-algebras were the lattice models D∞ constructed by Scott in
1969, see Section 4. The graph model Pω is less involved and therefore will
be described first. This model was found by PLOTKIN [1972] and, in a more
explicit form, by Scott. For connections of this model with recursion
theoretic ideas, see SCOTT [1975a], [1976].

The role of continuity in the model construction was already mentioned
in Section 1. The topology on Pω is such that a continuous function
Pω → Pω can be coded as an element of Pω. This is the essential feature
of the model.


3.1. DEFINITION. (i) Pω = {x | x ⊆ ω}.
(ii) There are countably many finite elements of Pω. As an effective
one-one enumeration of these sets we use eₙ, where

    eₙ = {k₀, ..., k_{m−1}} with k₀ < ⋯ < k_{m−1}  ⇔  n = Σ_{i<m} 2^{k_i}.

(iii) (·,·) is the coding of pairs of integers into the integers defined by
(n, m) = ½(n + m)(n + m + 1) + m.
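
Both codings are easy to compute. The following sketch reads eₙ off the binary expansion of n and implements the pairing of (iii); the function names and the inverse `unpair` are hypothetical additions for illustration.

```python
def finite_set(n):
    """e_n: the finite set {k_0, ..., k_{m-1}} with n = sum of 2**k_i."""
    return {k for k in range(n.bit_length()) if (n >> k) & 1}


def index_of(e):
    """Inverse of finite_set: the n with e_n = e."""
    return sum(2 ** k for k in e)


def pair(n, m):
    """(n, m) = (n + m)(n + m + 1)/2 + m."""
    return (n + m) * (n + m + 1) // 2 + m


def unpair(p):
    """Recover (n, m) from p = pair(n, m)."""
    s = 0
    while (s + 1) * (s + 2) // 2 <= p:   # find the diagonal s = n + m
        s += 1
    m = p - s * (s + 1) // 2
    return s - m, m
```

For example, `finite_set(5)` is `{0, 2}` since 5 = 2⁰ + 2², and `pair(0, 1)` is 2.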


3.2. DEFINITION. Let e ⊆ ω be a finite subset.

    O_e = {x ∈ Pω | e ⊆ x};  Oₙ = O_{eₙ}.

3.3. LEMMA. The {Oₙ}_{n∈ω} form a base for a topology on Pω.

PROOF. O_e ∩ O_{e'} = O_{e∪e'}. □


Henceforth we always will consider Pω provided with this topology.


3.4. LEMMA. f: Pω → Pω is continuous ⇔ f(x) = ⋃{f(eₙ) | eₙ ⊆ x}.


PROOF. (⇒) First note that f is monotonic. For suppose x ⊆ y and
n ∈ f(x). Then f(x) ∈ O_{{n}}. By continuity for some e, x ∈ O_e and
f(O_e) ⊆ O_{{n}}. Now e ⊆ x ⊆ y, so y ∈ O_e, hence f(y) ∈ O_{{n}}, i.e.
n ∈ f(y). Therefore indeed x ⊆ y ⇒ f(x) ⊆ f(y).

By monotonicity of f, f(x) ⊇ ⋃{f(eₙ) | eₙ ⊆ x}. To show the reverse,
suppose n ∈ f(x). Again for some e, x ∈ O_e and f(O_e) ⊆ O_{{n}}. Hence since
e ∈ O_e, we have f(e) ∈ O_{{n}}, i.e. n ∈ f(e) ⊆ ⋃{f(eₙ) | eₙ ⊆ x}.

(⇐) Again f is monotonic. For suppose x ⊆ y and n ∈ f(x). By the
assumption, n ∈ f(e) for some finite e ⊆ x. Hence n ∈ ⋃{f(e') | e' ⊆ y} =
f(y).

Now suppose f(x) ∈ O_e, i.e. e ⊆ f(x). Let e = {m₁, ..., m_k}. By assumption
m_i ∈ f(e_{n_i}) for some e_{n_i} ⊆ x. Then f(O_{e_{n_i}}) ⊆ O_{{m_i}} by
monotonicity. Then O = O_{e_{n_1}} ∩ ⋯ ∩ O_{e_{n_k}} is a neighborhood of x and

    f(O) ⊆ f(O_{e_{n_1}}) ∩ ⋯ ∩ f(O_{e_{n_k}}) ⊆ O_{{m₁}} ∩ ⋯ ∩ O_{{m_k}}
         = O_{{m₁, ..., m_k}} = O_e.

So ∀O_e ∋ f(x) ∃O ∋ x f(O) ⊆ O_e, i.e. f is continuous. □


By the previous lemma, a continuous function f is completely determined
by its values on the finite sets. Hence if one knows for which n, m one has
m ∈ f(eₙ), then the f(eₙ) are known and therefore f. Thus the
information of a continuous function can be coded into a set. This
operation is called graph. Its inverse operation is fun.


3.5. DEFINITION. (i) Let f: Pω → Pω be continuous, then
graph(f) = {(n, m) | m ∈ f(eₙ)}.
(ii) Let u ∈ Pω. The function fun(u) is defined by
fun(u)(x) = {m | ∃eₙ ⊆ x (n, m) ∈ u}.


3.6. THEOREM. (i) A continuous function f is uniquely determined by its
graph: fun(graph(f)) = f.
(ii) For every u ∈ Pω, fun(u) is continuous.

PROOF. (i)
    fun(graph(f))(x) = {m | ∃eₙ ⊆ x (n, m) ∈ graph(f)}
                     = {m | ∃eₙ ⊆ x m ∈ f(eₙ)}
                     = ⋃{f(eₙ) | eₙ ⊆ x} = f(x), by 3.4.

(ii) Let f = fun(u). Then f(x) = ⋃{uₙ | eₙ ⊆ x}, where
uₙ = {m | (n, m) ∈ u}. Now

    m ∈ f(x) ⇔ ∃n[eₙ ⊆ x ∧ m ∈ uₙ]
             ⇔ ∃n[∃n'[e_{n'} ⊆ eₙ ∧ m ∈ u_{n'}] ∧ eₙ ⊆ x]
             ⇔ ∃n[eₙ ⊆ x ∧ m ∈ f(eₙ)] ⇔ m ∈ ⋃{f(eₙ) | eₙ ⊆ x}.

So f(x) = ⋃{f(eₙ) | eₙ ⊆ x} and 3.4 applies. □

In general graph(fun(u)) = u does not hold, only ⊇.

3.7. DEFINITION. Application in Pω is defined by u · x = fun(u)(x).
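
The operations graph and fun can be explored on finite data. The sketch below is a finite approximation under stated assumptions: `graph(f, bound)` only collects the pairs (n, m) with n < bound, since the full graph is infinite in general, and all function names are hypothetical. It also exhibits a set u for which graph(fun(u)) properly contains u.

```python
def finite_set(n):
    """e_n, read off the binary expansion of n."""
    return {k for k in range(n.bit_length()) if (n >> k) & 1}


def pair(n, m):
    return (n + m) * (n + m + 1) // 2 + m


def unpair(p):
    s = 0
    while (s + 1) * (s + 2) // 2 <= p:
        s += 1
    m = p - s * (s + 1) // 2
    return s - m, m


def fun(u):
    """fun(u)(x) = {m | exists e_n ⊆ x with (n, m) in u}, for finite u."""
    def apply_to(x):
        return {m for (n, m) in map(unpair, u) if finite_set(n) <= x}
    return apply_to


def graph(f, bound):
    """Finite part of graph(f): the pairs (n, m) with n < bound, m in f(e_n)."""
    return {pair(n, m) for n in range(bound) for m in f(finite_set(n))}


identity = lambda x: set(x)
graph_id = graph(identity, 8)        # covers all e_n ⊆ {0, 1, 2}

u = {pair(0, 7)}                     # fun(u) is constantly {7}, as e_0 = ∅
```

Then `fun(graph_id)({0, 2})` returns `{0, 2}`, and `graph(fun(u), 4)` contains `pair(n, 7)` for every n < 4, so it is strictly larger than u.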
3.8. LEMMA. A function f: Pωᵏ → Pω is continuous iff f is continuous in
each of its variables separately (i.e. λx.f(x, y₀) and λy.f(x₀, y) are
continuous for all x₀, y₀).

PROOF. (⇒) As ever.
(⇐) It is sufficient to prove this for k = 2. Let f(x, y) be continuous in x
and y separately. Then

    f(x, y) = ⋃{f(eₙ, y) | eₙ ⊆ x} = ⋃{f(eₙ, e_m) | eₙ ⊆ x, e_m ⊆ y}.

As in the proof of 3.4 (⇐), it now follows that f is continuous in the product
topology sense. □


3.9. LEMMA (Continuity of application). Define Ap: Pω² → Pω by
Ap(u, x) = u · x; then Ap is continuous.


PROOF. λx.Ap(u, x) = λx.(u · x) = fun(u), which is continuous by 3.6(ii).
λu.Ap(u, x) = λu.(u · x) = λu.{m | ∃eₙ ⊆ x (n, m) ∈ u},
which is clearly continuous. Now the result follows from 3.8. □


3.10. LEMMA (Continuity of abstraction). Let f(x, y⃗): Pω^{k+1} → Pω be
continuous. Define g(y⃗) = graph(λx.f(x, y⃗)). Then g: Pωᵏ → Pω is
continuous.


PROOF. For simplicity we set k = 1. Note that by 3.8, g is well defined.
Now

    g(y) = graph(λx.f(x, y)) = {(n, m) | m ∈ f(eₙ, y)}
         = {(n, m) | m ∈ ⋃{f(eₙ, e) | e ⊆ y}} = {(n, m) | ∃e ⊆ y m ∈ f(eₙ, e)}
         = ⋃{{(n, m) | m ∈ f(eₙ, e)} | e ⊆ y} = ⋃{g(e) | e ⊆ y},

where e ranges over finite sets. Hence 3.4 applies. □


3.11. THEOREM. (Pω, ·) is a w.e. λ-algebra.


PROOF. For a continuous f: Pω^{k+1} → Pω, define
λ*d.f(d, e⃗) = graph(λd.f(d, e⃗)). By 3.10 this is a continuous function in e⃗
and by 3.7 and 3.6(i) one has (λ*d.f(d, e⃗)) · d = f(d, e⃗). Now for
A ≡ A(x, y₁, ..., y_k) ∈ T(Pω), where {y⃗} = FV(A) − {x}, define
λ*x.A = c_a y⃗, where a = λ*e₁ ⋯ λ*e_k λ*d.(A(c_d, c_{e₁}, ..., c_{e_k}))^{Pω}
and (⋯)^{Pω} denotes the interpretation in Pω. This makes Pω into a
λ-algebra. Since λ* is defined by functions in extension, Pω with λ* is
w.e. □


SCOTT [1975a] considers an extension of the lambda calculus, called
LAMBDA, together with an interpretation in Pω. It is proved that the
interior of Pω with respect to LAMBDA consists exactly of the recursively
enumerable sets.


4. Construction of D∞


The results in this section are due to SCOTT [1972]. Again continuity is
the essential feature in the model construction.

First complete lattices and their induced topologies are considered. Then
follows the construction of the λ-algebras D∞ as a projective (and at the
same time direct) limit of these lattices.


4.1. DEFINITION. Let D be a complete lattice, i.e. a partially ordered set
(D, ⊑) such that each subset X ⊆ D has a supremum ⊔X ∈ D. Then each
subset X has an infimum as well:

    ⊓X = ⊔{z | z ⊑ X}, where z ⊑ X ⇔ ∀x ∈ X z ⊑ x.

Top, ⊤ = ⊔D, and bottom, ⊥ = ⊓D, are resp. the largest and smallest
elements of D. A subset X ⊆ D is directed iff ∀x, y ∈ X ∃z ∈ X x, y ⊑ z.
Further, x ⊔ y (x ⊓ y) is the supremum (infimum) of {x, y}. D, D', D'', ...
will range over complete lattices.
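
As a small sanity check of the identity ⊓X = ⊔{z | z ⊑ X}, one can take a concrete finite complete lattice. The sketch below uses the divisors of 12 ordered by divisibility, an example chosen here and not taken from the text, where the supremum is the least common multiple and the infimum obtained from the formula turns out to be the greatest common divisor.

```python
from functools import reduce
from math import gcd

D = [d for d in range(1, 13) if 12 % d == 0]   # 1, 2, 3, 4, 6, 12


def leq(a, b):
    """a ⊑ b iff a divides b."""
    return b % a == 0


def sup(X):
    """Supremum = least common multiple; sup of the empty set is bottom (1)."""
    return reduce(lambda a, b: a * b // gcd(a, b), X, 1)


def inf(X):
    """Infimum computed as the supremum of all lower bounds of X."""
    return sup([z for z in D if all(leq(z, x) for x in X)])


def is_directed(X):
    """Every pair of elements of X has an upper bound inside X."""
    return all(any(leq(x, z) and leq(y, z) for z in X) for x in X for y in X)


top, bottom = sup(D), inf(D)
```

Here `inf([4, 6])` is 2, `top` is 12 and `bottom` is 1; the set `[2, 3]` is not directed, whereas `[2, 3, 6]` is.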


4.2. DEFINITION. A subset U ⊆ D is open iff
(i) x ∈ U and x ⊑ y ⇒ y ∈ U, and
(ii) ⊔X ∈ U ⇒ X ∩ U ≠ ∅ for all directed X ⊆ D.


D and ∅ are open, and open sets are closed under arbitrary unions; if
U₁, U₂ are open, then U₁ ∩ U₂ is open by the fact that in (ii), X is directed.
Hence the partial ordering induces a topology on D. Note that the sets
U_x = {z | z ⋢ x} are open and x ∉ U_x. Therefore the topology is T₀: if x, y
are different, say x ⋢ y, then x ∈ U_y, y ∉ U_y. The space is not T₁: if x ⊑ y
and x ∈ U, open, then y ∈ U.


4.3. LEMMA. A mapping f: D → D' is continuous ⇔ f(⊔X) = ⊔f(X)
for all directed X ⊆ D (where f(X) = {f(x) | x ∈ X} and the second ⊔ is to
be taken in D').


PROOF. (⇒) Let f be continuous. Suppose x ⊑ y in order to show
f(x) ⊑ f(y). If not, then f(x) ∈ U_{f(y)}, so x ∈ f⁻¹(U_{f(y)}), which is open.
Therefore y ∈ f⁻¹(U_{f(y)}), i.e. f(y) ∈ U_{f(y)}, a contradiction. Hence f is
monotonic. It follows that, since ⊔X ⊒ X, f(⊔X) ⊒ f(X). Therefore
f(⊔X) ⊒ ⊔f(X). If f(⊔X) ⋢ ⊔f(X), then f(⊔X) ∈ U_{⊔f(X)} and a
contradiction can be obtained as above, using condition (ii) for open sets.

(⇐) Again f is monotonic, since if x ⊑ y, then y = x ⊔ y, hence
f(y) = f(x) ⊔ f(y), so f(x) ⊑ f(y). Therefore if U ⊆ D' is open, so is
f⁻¹(U) ⊆ D. □


4.4. DEFINITION. If f(⊔X) = ⊔f(X) holds for arbitrary X, then f is
called distributive.


4.5. COROLLARY. Let f_i: D → D' be a collection of continuous mappings.
Define f = λx.⊔_i f_i(x). Then f is continuous.


PROOF. f(⊔X) = ⊔_i ⊔_{x∈X} f_i(x) = ⊔_{x∈X} ⊔_i f_i(x) = ⊔f(X). □


4.6. DEFINITION. D × D' is the cartesian product partially ordered by
(x, y) ⊑ (x', y') iff x ⊑ x' and y ⊑ y'. [D → D'] is the set of continuous
maps: D → D' partially ordered by f ⊑ g ⇔ ∀x ∈ D f(x) ⊑ g(x). Then
D × D' and [D → D'] are complete lattices with ⊔X = (⊔(X)₀, ⊔(X)₁)
for X ⊆ D × D' and ⊔F = λx.⊔{f(x) | f ∈ F} for F ⊆ [D → D']. (If
z = (x, y), then (z)₀ = x, (z)₁ = y; ⊔F is continuous by 4.5.) The induced
topology on D × D' is not necessarily the product of the induced topologies
on D and D'.


4.7. LEMMA. f: D × D' → D'' is continuous ⇔ f is continuous in each of its
variables separately (i.e. λd.f(d, d₀') and λd'.f(d₀, d') are continuous for
all d₀, d₀').


PROOF. (⇒) As ever.
(⇐) For directed X ⊆ D × D',

    f(⊔X) = f(⊔(X)₀, ⊔(X)₁) = ⊔_{d∈(X)₀} f(d, ⊔(X)₁)
          = ⊔_{d∈(X)₀} ⊔_{d'∈(X)₁} f(d, d') = ⊔_{(d,d')∈X} f(d, d') = ⊔f(X). □

4.8. LEMMA (Continuity of application). Define Ap (application):
[D → D'] × D → D' by Ap(f, x) = f(x). Then Ap is continuous.


PROOF. λx.f₀(x) = f₀ is continuous. λf.f(x₀) = h₀ with
h₀(⊔F) = (⊔F)(x₀) = ⊔_{f∈F} f(x₀) = ⊔h₀(F). Hence h₀ is continuous.
Therefore, by 4.7, Ap is continuous. □


4.9. LEMMA (Continuity of abstraction). Let f ∈ [D × D' → D'']. Define
g_f(x) = λy ∈ D'.f(x, y). Then

(i) g_f is continuous,

(ii) λf.g_f: [D × D' → D''] → [D → [D' → D'']] is continuous.


PROOF. (i) g_f(⊔X) = λy.f(⊔X, y) = λy.⊔_{x∈X} f(x, y) = ⊔_{x∈X} λy.f(x, y)
= ⊔g_f(X). (λy.⊔ = ⊔λy follows from the definition of ⊔ in a function
space.)

(ii) Let L = λf.g_f. L(⊔F) = λx.λy.(⊔F)(x, y) = λx.λy.⊔_{f∈F} f(x, y) =
⊔_{f∈F} λx.λy.f(x, y) = ⊔L(F). □


4.10. DEFINITION. Let φ: D → D', ψ: D' → D. (φ, ψ) is a projection of D'
on D iff
(i) φ, ψ are continuous,
(ii) ∀x ∈ D ψ(φ(x)) = x,
(iii) ∀x' ∈ D' φ(ψ(x')) ⊑ x'.


4.11. DEFINITION (Construction of D∞). Let D be an arbitrary nontrivial
complete lattice. Define D₀ = D, D_{n+1} = [Dₙ → Dₙ]. Mappings
φₙ: Dₙ → D_{n+1} and ψₙ: D_{n+1} → Dₙ are defined as follows:

    φ₀(x) = λy ∈ D₀.x,        ψ₀(x') = x'(⊥), where ⊥ ∈ D₀;
    φ_{n+1}(x) = φₙ ∘ x ∘ ψₙ,   ψ_{n+1}(x') = ψₙ ∘ x' ∘ φₙ   (see Diagram 1).

[Diagram 1: the lattices Dₙ, D_{n+1}, D_{n+2} drawn twice, connected by
ψₙ, ψ_{n+1} in the upper row and by φₙ, φ_{n+1} in the lower row, with
vertical arrows x ∈ D_{n+1} and x' ∈ D_{n+2}.]

By a straightforward induction on n it follows that (φₙ, ψₙ) is a projection
of D_{n+1} on Dₙ. Moreover φₙ and ψₙ are distributive. For an
element x = ⟨xₙ⟩_{n=0}^∞ of the product Π_{n=0}^∞ Dₙ, x_i denotes the i-th
coordinate.

    D∞ = lim←(Dₙ, ψₙ) = {x ∈ Π_{n=0}^∞ Dₙ | ∀n ∈ ω ψₙ(x_{n+1}) = xₙ}.

For x, y ∈ D∞ a partial ordering is defined by x ⊑ y ⇔ ∀n ∈ ω xₙ ⊑ yₙ.
Then D∞ is a complete lattice with, for X ⊆ D∞, ⊔X = ⟨⊔Xₙ⟩_{n=0}^∞. This
belongs to D∞ since ψₙ(⊔X_{n+1}) = ⊔ψₙ(X_{n+1}) = ⊔Xₙ, by the distributivity
of ψₙ.
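
The first two steps of the construction can be computed explicitly when D₀ is finite. The sketch below is an illustration under assumptions chosen here: D₀ is the two-point lattice {0, 1} with ⊥ = 0, continuous maps on a finite lattice are taken to be the monotone maps, and a map is stored as the tuple of its values. It checks the projection laws ψₙ(φₙ(x)) = x and φₙ(ψₙ(x')) ⊑ x' for n = 0, 1.

```python
from itertools import product

D0 = [0, 1]                                    # bottom = 0, top = 1

# D1 = [D0 -> D0]: monotone maps f stored as (f(0), f(1)).
D1 = [f for f in product(D0, repeat=2) if f[0] <= f[1]]


def leq1(f, g):
    return all(a <= b for a, b in zip(f, g))


# D2 = [D1 -> D1]: monotone maps h stored as (h(D1[0]), h(D1[1]), h(D1[2])).
D2 = [h for h in product(D1, repeat=len(D1))
      if all(leq1(h[i], h[j])
             for i in range(len(D1)) for j in range(len(D1))
             if leq1(D1[i], D1[j]))]


def leq2(g, h):
    return all(leq1(a, b) for a, b in zip(g, h))


def phi0(x):
    """phi_0(x) = λy ∈ D0. x, the constant map."""
    return (x, x)


def psi0(f):
    """psi_0(x') = x'(bottom)."""
    return f[0]


def phi1(x):
    """phi_1(x) = phi_0 ∘ x ∘ psi_0, an element of D2."""
    return tuple(phi0(x[psi0(d)]) for d in D1)


def psi1(h):
    """psi_1(x') = psi_0 ∘ x' ∘ phi_0, an element of D1."""
    return tuple(psi0(h[D1.index(phi0(a))]) for a in D0)
```

With this representation D₁ has 3 elements and D₂ has 10, and the two projection laws hold for every element, as the lemma following Diagram 1 asserts.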


4.12. DEFINITION. Mappings Φ_{nm}: Dₙ → D_m are defined (by following the
arrows in D₀ ⇄ D₁ ⇄ ⋯). If n ≤ m, say m = n + k, Φ_{nm} is defined by
induction on k: Φ_{nn} = λx ∈ Dₙ.x, Φ_{n(m+1)} = φ_m ∘ Φ_{nm}. If m ≤ n, say
n = m + k, Φ_{nm} is again defined by induction on k: Φ_{(n+1)m} = Φ_{nm} ∘ ψₙ.

Φ_{n∞}: Dₙ → D∞ is defined by Φ_{n∞}(x) = ⟨Φ_{ni}(x)⟩_{i=0}^∞. Φ_{∞n}: D∞ → Dₙ is
defined by Φ_{∞n}(x) = xₙ.


4.13. LEMMA. (i) For 0 ≤ n ≤ m ≤ ∞, (Φ_{nm}, Φ_{mn}) is a projection of D_m
on Dₙ.
(ii) For 0 ≤ n ≤ m ≤ l ≤ ∞, Φ_{ml} ∘ Φ_{nm} = Φ_{nl}.


PROOF. Standard. □


It follows that up to isomorphism D₀ ⊆ D₁ ⊆ ⋯ ⊆ D∞. In fact in the
category of complete lattices with continuous mappings as morphisms, D∞ is
not only the inverse limit lim←(Dₙ, ψₙ), but also a direct limit:
D∞ = lim→(Dₙ, φₙ). Note however that D∞ ≠ ⋃ₙ Dₙ; D∞ is the completion of
⋃ₙ Dₙ. Henceforth each element x ∈ Dₙ will be identified with
Φ_{n∞}(x) ∈ D∞, as is customary with direct limits. In particular


4.14. LEMMA. (i) If x ∈ Dₙ, then x = xₙ.
(ii) If x ∈ Dₙ, then φₙ(x) = x.
(iii) If x ∈ D_{n+1}, then ψₙ(x) ⊑ x.


PROOF. (i) x in D∞ is ⟨..., ψ(x), x, φ(x), φ(φ(x)), ...⟩. Hence xₙ is x.
(ii) φₙ(x) in D∞ is ⟨..., ψ(φₙ(x)), φₙ(x), φ(φₙ(x)), ...⟩. Since
ψ(φ(x)) = x, this is the same as x in D∞.
(iii) Analogous, using φ(ψ(x)) ⊑ x. □


Due to the identifications, properties of D∞ are more elegant to
formulate:


4.15. LEMMA. In D∞:
(i) (xₙ)_m = x_{min(n,m)}.
(ii) If n ≤ m, then xₙ ⊑ x_m ⊑ x.
(iii) x = ⊔_{n=0}^∞ xₙ.
(iv) ⊤ₙ and ⊥ₙ are top and bottom of Dₙ.
(v) ⊥ₙ = ⊥; ⊤ₙ = ⊤.


PROOF. (i) If m ≤ n, (xₙ)_m = Φ_{nm}(xₙ) = ψ_m ∘ ⋯ ∘ ψ_{n−1}(xₙ) = x_m, since
x ∈ D∞. If m ≥ n, (xₙ)_m = φ_{m−1} ∘ ⋯ ∘ φₙ(xₙ) = xₙ by 4.14(ii).
(ii) First, by 4.14(iii), x_m ⊑ x_{m+1}, since x_m = ψ_m(x_{m+1}). Hence
x₀ ⊑ x₁ ⊑ ⋯. Furthermore xₙ ⊑ x since ∀i (xₙ)_i = x_{min(n,i)} ⊑ x_i.
(iii) ⊔ₙ xₙ = ⟨⊔ₙ (xₙ)_i⟩_{i=0}^∞ = ⟨⊔ₙ x_{min(n,i)}⟩_{i=0}^∞ = ⟨x_i⟩_{i=0}^∞ = x.
(iv) Let ⊤'_i and ⊥'_i be resp. top and bottom of D_i. Then

    ⊤ = ⊔D∞ = ⟨⊔D_i⟩_{i=0}^∞ = ⟨⊤'_i⟩_{i=0}^∞

and

    ⊥ = ⊔∅ = ⟨⊔∅⟩_{i=0}^∞ = ⟨⊥'_i⟩_{i=0}^∞.

Hence ⊤ₙ = ⊤'ₙ, ⊥ₙ = ⊥'ₙ.

(v) By (ii), ⊥ₙ ⊑ ⊥ and ⊥ ⊑ ⊥ₙ always. So ⊥ₙ = ⊥. On the other hand it
follows by induction that φₙ(⊤ₙ) = ⊤_{n+1}, since ⊤_{n+1} = λx ∈ Dₙ.⊤ₙ.
Moreover since ⟨⊤ₙ⟩_{n∈ω} ∈ D∞, ψₙ(⊤_{n+1}) = ⊤ₙ. Therefore ⊤ₙ in D∞, i.e.
⟨Φ_{nm}(⊤ₙ)⟩_{m∈ω}, is ⟨⊤_m⟩_{m∈ω} = ⊤. □


4.16. DEFINITION. In D∞ one can define a binary operation application:
x · y = ⊔ₙ x_{n+1}(yₙ). On the RHS the application is the usual one
D_{n+1} × Dₙ → Dₙ and the ⊔ is to be taken in D∞ after identification.


4.17. LEMMA. Application is well defined, in the sense that if
x_{n+1} ∈ D_{n+1} and yₙ ∈ Dₙ, then x_{n+1} · yₙ = x_{n+1}(yₙ).


PROOF.

    x_{n+1} · yₙ = ⊔_i (x_{n+1})_{i+1}((yₙ)_i)
                = ⊔_{i≤n} x_{i+1}(y_i)    by 4.15(i),
                = x_{n+1}(yₙ)             by 4.15(ii). □
4.18. LEMMA. Application is continuous.


PROOF. x · y = ⊔ₙ Φ_{n∞}(x_{n+1}(yₙ)). Now apply 4.5, 4.8 and 4.13. □

4.19. LEMMA. (i) x_{n+1} · y = x_{n+1} · yₙ = (x · yₙ)ₙ.
(ii) x₀ · y = x₀ = (x · ⊥)₀.


PROOF. (i) First it is shown by induction on i ≥ n that
(x_{n+1})_{i+1}(y_i) = x_{n+1}(yₙ). For i = n this is clear. Now

    (x_{n+1})_{i+2}(y_{i+1}) = φ_{i+1}((x_{n+1})_{i+1})(y_{i+1})
                           = φ_i ∘ (x_{n+1})_{i+1} ∘ ψ_i (y_{i+1})
                           = φ_i((x_{n+1})_{i+1}(y_i))
                           = (x_{n+1})_{i+1}(y_i), by 4.14(ii),
                           = x_{n+1}(yₙ)

by the induction hypothesis. Therefore

    x_{n+1} · y = ⊔_i (x_{n+1})_{i+1}(y_i)
               = ⊔_i x_{n+1}(yₙ) = x_{n+1} · yₙ (by 4.17).

Again by induction on i ≥ n it is shown that (x_{i+1}((yₙ)_i))ₙ = x_{n+1}(yₙ).
For i = n this is clear. Now

    (x_{i+2}((yₙ)_{i+1}))ₙ = Φ_{(i+1)n}(x_{i+2}(φ_i((yₙ)_i)))
                          = Φ_{in} ∘ ψ_i ∘ x_{i+2}(φ_i((yₙ)_i))
                          = Φ_{in}((ψ_{i+1}(x_{i+2}))((yₙ)_i)) = Φ_{in}(x_{i+1}((yₙ)_i))
                          = (x_{i+1}((yₙ)_i))ₙ = x_{n+1}(yₙ)

by the induction hypothesis. Therefore

    (x · yₙ)ₙ = (⊔_i x_{i+1}((yₙ)_i))ₙ = ⊔_i (x_{i+1}((yₙ)_i))ₙ
             = ⊔_i x_{n+1}(yₙ) = x_{n+1} · yₙ.

(ii) By (i), x₀ · y = (x₀)₁ · y = (x₀)₁(y₀) = φ₀(x₀)(y₀) = x₀. Also
x₀ = ψ₀(x₁) = x₁(⊥₀) = (x · ⊥₀)₀ = (x · ⊥)₀. □


4.20. THEOREM (extensionality). For x, y ∈ D∞,
(i) x ⊑ y ⇔ ∀z ∈ D∞ x · z ⊑ y · z,
(ii) x = y ⇔ ∀z ∈ D∞ x · z = y · z.


PROOF. (i) (⇒) By monotonicity of λx.(x · z). (⇐) Suppose
∀z x · z ⊑ y · z. Then x · ⊥ ⊑ y · ⊥, so x₀ = (x · ⊥)₀ ⊑ (y · ⊥)₀ = y₀ by
4.19(ii). Moreover x · zₙ ⊑ y · zₙ, so x_{n+1}(zₙ) = (x · zₙ)ₙ ⊑ (y · zₙ)ₙ =
y_{n+1}(zₙ) by 4.17 and 4.19(i). Hence x_{n+1} ⊑ y_{n+1}. Now we have
∀n xₙ ⊑ yₙ, i.e. x ⊑ y.

(ii) Immediate by (i). □

4.21. THEOREM (completeness). ∀f ∈ [D∞ → D∞] ∃x ∈ D∞ ∀y ∈ D∞ f(y) = x · y.


PROOF. Let x = ⊔ₙ (λy ∈ Dₙ.(f(y))ₙ). Note that this is the supremum of a
directed set. Remark that for a_{ij} in a complete lattice,

    ⊔_{i,j} a_{ij} = ⊔_k a_{kk}  if ∀i, j ∃k a_{ij} ⊑ a_{kk}.

Now

    x · y = ⊔_m x_{m+1}(y_m) = ⊔_m (x · y_m)_m
          = ⊔_m ((⊔ₙ λy ∈ Dₙ.(f(y))ₙ) · y_m)_m
          = ⊔_{m,n} ((λy ∈ Dₙ.(f(y))ₙ) · y_m)_m
          = ⊔_m (λy ∈ D_m.(f(y))_m) · y_m
          = ⊔_m (f(y_m))_m = ⊔_{k,l} (f(y_k))_l = ⊔_k f(y_k) = f(y).

Comments that should accompany these equations follow easily from the
continuity (monotonicity) of the functions involved and the remark
above. □


4.22. COROLLARY. D∞ is homeomorphic to [D∞ → D∞].


PROOF. For x ∈ D∞ let F(x) = λy ∈ D∞.x · y. F is surjective by 4.21,
injective by 4.20, continuous by 4.9(i). The inverse to F is, by the proof of
4.21, λf.⊔ₙ (λy ∈ Dₙ.(f(y))ₙ), which is continuous by 4.9(ii) and 4.5. □


4.23. THEOREM. D∞ is an extensional λ-algebra.


PROOF. Combinatory completeness follows from 4.21 since application and
abstraction are continuous. Extensionality was proved in 4.20(ii). Hence
1.16(2) applies. □


Since D∞ is extensional there is no ambiguity interpreting λ-terms in it.
However for later reference, the interpretation will be given explicitly.


4.24. DEFINITION. (i) A valuation (in D∞) is a mapping ρ: variables → D∞.
(ii) For d ∈ D∞ and x a variable, ρ(d/x) is the valuation ρ' with
ρ'(y) = if y ≢ x then ρ(y) else d.
(iii) The interpretation of M in D∞ under ρ, ⟦M⟧ρ, is defined inductively:
⟦x⟧ρ = ρ(x), ⟦MN⟧ρ = ⟦M⟧ρ · ⟦N⟧ρ, ⟦λx.M⟧ρ = λd.⟦M⟧ρ(d/x) ∈ D∞ after
identification with [D∞ → D∞].


Roughly, in the interpretation formal variables are replaced by variables
ranging over D∞, and application and abstraction are to be taken in D∞.
From this it should be obvious that the interpretation is correct. This
can be made precise by showing inductively
⟦(λx.M)N⟧ρ = ⟦M⟧ρ(⟦N⟧ρ/x) = ⟦M[x/N]⟧ρ.

Whenever possible the valuation ρ will be omitted in the notation ⟦M⟧ρ.


5. Solvability 


The concept of solvability and the related notion of head normal form
were introduced respectively in the dissertations of Barendregt and
Wadsworth. Both theses give arguments for the computational irrelevance
of unsolvable terms. Therefore a λ-theory (or λ-algebra) is called sensible
iff it equates all unsolvable terms. It turned out that there is a unique
maximal sensible theory. In Section 7 it will be proved that this theory
equals Th(D∞).


5.1. DEFINITION. (i) A closed term M is solvable iff λ ⊢ MN₁ ⋯ Nₙ = I for
some n and terms N₁, ..., Nₙ.
(ii) An arbitrary term M is solvable iff its closure λx⃗.M is solvable. M
is unsolvable iff M is not solvable.
(iii) ℋ = {M = N | M, N unsolvable}.


To see the particular role of I in this definition, note that M (closed) is
solvable iff ∀P ∃N⃗ λ ⊢ MN⃗ = P.


5.2. DEFINITION. (i) Each λ-term M is of the form

    λx₁ ⋯ xₙ.(λx.P)QM₁ ⋯ M_m,  or  λx₁ ⋯ xₙ.xᵢM₁ ⋯ M_m,

m, n ≥ 0. In the first case (λx.P)Q is the head redex of M. In the second
case xᵢ is the head variable of M and M is said to be in head normal form
(hnf).
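
The two shapes of 5.2 can be recognized by a simple traversal: strip the leading abstractions, then walk down the left spine of applications. The sketch below assumes a hypothetical tuple representation of terms, `('var', name)`, `('lam', name, body)` and `('app', function, argument)`; the function name `head_form` is likewise illustrative.

```python
def head_form(term):
    """Split a term into leading binders, head (variable or redex) and arguments."""
    binders = []
    while term[0] == 'lam':                 # the prefix λx1 ... xn
        binders.append(term[1])
        term = term[2]
    args = []
    while term[0] == 'app':                 # walk down the left spine
        args.append(term[2])
        term = term[1]
    args.reverse()
    if term[0] == 'var':                    # second shape: head normal form
        return {'binders': binders, 'head_variable': term[1],
                'head_redex': None, 'args': args}
    # First shape: the spine ends in an abstraction, so args is non-empty.
    return {'binders': binders, 'head_variable': None,
            'head_redex': ('app', term, args[0]), 'args': args[1:]}


def is_hnf(term):
    return head_form(term)['head_redex'] is None


x, y, z = ('var', 'x'), ('var', 'y'), ('var', 'z')
W = ('lam', 'x', ('app', x, x))
OMEGA = ('app', W, W)                                   # (λx.xx)(λx.xx)
EXAMPLE = ('lam', 'x', ('app', ('app', x, y), OMEGA))   # λx.x y Ω
```

`EXAMPLE` is in hnf with head variable x even though its argument Ω has no normal form, whereas `OMEGA` itself is its own head redex.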


The following can be proved syntactically. A semantic proof will be 
given in 7.9 and 7.10. 


5.3. THEOREM. (i) M is solvable ⇔ M has a hnf.
(ii) λη + ℋ is consistent.


5.4. EXAMPLES. I and Y are solvable; Ω and K^∞ ≡ YK are unsolvable
(note that λ ⊢ Y(KI) = I; ΩN₁ ⋯ Nₙ → M ⇒ M ≡ ΩN₁' ⋯ Nₙ', with
Nᵢ → Nᵢ'; and λ ⊢ K^∞M = K^∞ for all M). Note also λ + ℋ ⊢ λx.Ω = Ωx =
Ω, since M unsolvable ⇒ MN, λx.M unsolvable.


5.5. DEFINITION. (i) A context C[·] is a term with some holes in it. More
formally: any variable x is a context; [·] is a context; if C₁[·], C₂[·] are
contexts, so are (C₁[·]C₂[·]) and (λx.C₁[·]). If M is an arbitrary term and
C[·] a context, C[M] denotes the result of placing M in the holes of C[·].
In this act, free variables of M may become bound in C[M].

(ii) M and N are solvably equivalent, notation M ~ₛ N, iff ∀C[·] [C[M]
is solvable ⇔ C[N] is solvable].

(iii) ℋ* = {M = N | M, N ∈ Λ° and M ~ₛ N}.


5.6. LEMMA. ℋ* = ℋ*⁺.


PROOF. Induction on the length of proof shows that λ + ℋ* ⊢ M = N ⇒
M ~ₛ N. □


5.7. COROLLARY. ℋ* is consistent, hence a λ-theory.

PROOF. I ≁ₛ Ω, hence I = Ω ∉ ℋ*. □

5.8. LEMMA. If ℋ + M = N is consistent, then ℋ* ⊢ M = N.

PROOF. First note that {I = K^∞} is inconsistent: λ + I = K^∞ ⊢ M = IM =
K^∞M = K^∞ for all M. Now suppose M = N ∉ ℋ*. Then, say, C[M] is
solvable and C[N] is unsolvable for some C[·]. Therefore
λ ⊢ (λx⃗.C[M])P⃗ = I for some P⃗ and ℋ ⊢ C[N] = K^∞. Now

    λ + ℋ + M = N ⊢ I = (λx⃗.C[M])P⃗ = (λx⃗.C[N])P⃗
                      = (λx⃗.K^∞)P⃗ = K^∞P⃗ = K^∞,

so λ + ℋ + M = N would be inconsistent. □


5.9. COROLLARY. (i) ℋ* is the unique maximal λ-theory extending ℋ.
(ii) Moreover, ℋ* proves extensionality.


PROOF. (i) Since Con(ℋ), by 5.3, ℋ ⊆ ℋ* by 5.8. If T is a consistent
extension of ℋ, then for each M = N ∈ T, M = N ∈ ℋ* by 5.8; so T ⊆ ℋ*.
Therefore ℋ* contains each consistent extension of ℋ and being itself
consistent, 5.7, the statement follows.

(ii) λ + ℋ + (λx.Mx) = M (x ∉ FV(M)) is consistent by 5.3(ii). Hence
ℋ* ⊢ λx.Mx = M. □


The possibility that ℋ has a unique maximally consistent extension is
due to the fact that the language of the λ-calculus is logic free, i.e. it is not
possible that for an undecided sentence σ we make two extensions by
adding σ and ¬σ respectively, because the language does not contain
negation. The theory λ however has 2^ℵ₀ maximal consistent extensions.


5.10. DEFINITION. A λ-algebra 𝔐 is sensible iff 𝔐 ⊨ ℋ. In that case
Th(𝔐) ⊆ ℋ* by 5.9.


The "least" sensible model is 𝔐°(ℋ*):


5.11. COROLLARY. 𝔐°(ℋ*) is algebraically simple, i.e. has no proper
homomorphic images.


PROOF. If 𝔐 were a proper image of 𝔐°(ℋ*), then Th(𝔐) would be a
proper extension of ℋ*. □


In Section 7 it will be shown that 𝔐°(ℋ*) is the interior of D∞.


6. Böhm trees


The trees introduced in this section are inspired by the proof of the
Theorem of Böhm, 2.23, and the concept of solvability. The Böhm trees
are useful for the analysis of Pω and D∞. See NAKAJIMA [1975] for a related
family of trees.


6.1. DEFINITION. A tree is a set A of sequence numbers such that
(i) if α ∈ A and β < α (ordering of sequence numbers), then β ∈ A,
(ii) for each α ∈ A there are only finitely many immediate successors of
α in A. The α ∈ A are called the nodes of the tree. The depth of
α = ⟨n₀, ..., n_{k−1}⟩, notation d(α), is k. * denotes concatenation, i.e.
⟨n⟩ * ⟨m⟩ = ⟨n, m⟩. The subtree at α of A, A_α, is the set {β | α * β ∈ A}.
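
For finite trees these notions are directly computable. The sketch below represents sequence numbers as Python tuples and assumes that β < α means that β is a proper initial segment of α; for a finite set condition (ii) holds automatically, so only (i) is checked. The function names are hypothetical.

```python
def is_tree(nodes):
    """Condition (i): every proper initial segment of a node is a node."""
    return all(alpha[:i] in nodes for alpha in nodes for i in range(len(alpha)))


def depth(alpha):
    """d(α) = length of the sequence α."""
    return len(alpha)


def subtree(nodes, alpha):
    """A_α = {β | α * β ∈ A}, with * implemented as tuple concatenation."""
    k = len(alpha)
    return {beta[k:] for beta in nodes if beta[:k] == alpha}


A = {(), (0,), (1,), (1, 0)}
```

Here `A` is a tree, `subtree(A, (1,))` is `{(), (0,)}`, and the set `{(0,)}` is not a tree because the empty sequence is missing.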


6.2. DEFINITION. (i) A labelled tree is a tree where at each node there is a
symbol which is either Ω or λx₁ ⋯ xₙ.xᵢ for some variables x₁, ..., xₙ, xᵢ.
To be precise, a labelled tree is a mapping f_A from the sequence numbers
into the set

    {*, Ω} ∪ {(⟨i₁, ..., iₙ⟩, i) | i₁, ..., iₙ, i ∈ ω},

such that A = {α | f_A(α) ≠ *} is a tree. f_A(α) = (⟨i₁, ..., iₙ⟩, i) (resp. Ω)
means that λa_{i₁} ⋯ a_{iₙ}.a_i (resp. Ω) is written at node α ∈ A.
(ii) If A, B are labelled trees, then

    A =_k B ⇔ ∀α [[d(α) < k ⇒ f_A(α) = f_B(α)] ∧
                  [d(α) = k ⇒ [f_A(α) ≠ * ⇔ f_B(α) ≠ *]]],

i.e. for depth < k the labelled trees are equal and the nodes of depth k − 1
have in both trees the same number of successors.


6.3. DEFINITION. The Böhm tree of a λ-term M, BT(M), is a labelled tree
defined as follows: If M is unsolvable, then BT(M) = Ω. If M is solvable,
say M has hnf λx₁ ⋯ xₙ.xᵢM₁ ⋯ M_m, then BT(M) is

            λx₁ ⋯ xₙ.xᵢ
           /     ⋯     \
      BT(M₁)    ⋯    BT(M_m)

To be precise, if M is unsolvable, then BT(M)(⟨ ⟩) = Ω, BT(M)(⟨j⟩ * α) =
*. If M is solvable, say M has hnf λa_{i₁} ⋯ a_{iₙ}.a_iM₀ ⋯ M_{m−1}, then
BT(M)(⟨ ⟩) = (⟨i₁, ..., iₙ⟩, i), BT(M)(⟨j⟩ * α) = BT(M_j)(α) for j < m and
BT(M)(⟨j⟩ * α) = * for j ≥ m.

Free and bound occurrences of a variable in a Böhm tree are defined as
for terms. As with terms, Böhm trees are considered modulo a change of
bound variables.


From the Church-Rosser theorem it follows that if M has a hnf
λx₁ ⋯ xₙ.xᵢM₁ ⋯ M_m, then n, i, m are uniquely determined. Hence the
Böhm tree of M is well defined and if λ ⊢ M = N, then BT(M) = BT(N).


6.4. EXAMPLE.

    BT(S):  λabc.a        BT(SxΩ):  λc.x        BT(Y):  λf.f
            /    \                  /   \               |
           c      b                c     Ω              f
                  |                                     |
                  c                                     f
                                                        ⋮

Note that although λx.x and λxy.xy are extensionally equal, they have
different Böhm trees. Therefore the following equivalence relation is
introduced.


6.5. DEFINITION. (i) M', N' merge BT(M), BT(N) up to k iff λη ⊢ M = M',
λη ⊢ N = N' and BT(M') =_k BT(N').

(ii) M, N have equivalent Böhm trees, notation BT(M) ~η BT(N), iff
∀k ∃M', N' M', N' merge BT(M), BT(N) up to k.


Now we give some examples of terms with equal or equivalent Böhm
trees.


6.6. EXAMPLE. (i) By the fixed point theorem there exists a term A such
that Ax → λz.z(Ax). Then BT(Ax) = BT(Ay) (x and y disappear from
the tree).

(ii) Let Y₁ ≡ λf.(λxz.f(xxz))(λxz.f(xxz))z. Then Y₁ is an alternative
fixed point operator not convertible with Y. But BT(Y₁) = BT(Y).

(iii) BT(λx.x) ~η BT(λxy.xy).


The following is a less trivial example of equivalent Böhm trees.
Together with the characterization Theorem 7.1 it shows that in D∞ a
normal form may be equal to a term without a nf.


6.7. EXAMPLE (Wadsworth). Let J ≡ Y(λjxy.x(jy)). Then BT(J) ~η
BT(I).


PROOF. The hnf of J is λx₀x₁.x₀(Jx₁), so BT(J) is

    λx₀x₁.x₀
       |
    λx₂.x₁
       |
    λx₃.x₂
       ⋮

This can be merged to any depth with BT(I) by some η expansions (i.e. the
opposite of a contraction) of the latter. □


6.8. THEOREM. ℋ* ⊢ M = N ⇒ BT(M) ~η BT(N).

The rather technical proof occupies the rest of this section and can be
omitted at a first reading.


6.9. DEFINITION. Let A, A' be Böhm trees.
(i) A, A' are top mergeable iff either A, A' are both Ω or A and A' are

    λx₁ ⋯ xₙ.xᵢ              λx₁ ⋯ x_{n'}.x_{i'}
     /  ⋯  \        and       /  ⋯  \
    (m successors)           (m' successors)

respectively, and i = i' and n − m = n' − m' (possibly negative numbers);
the sequences x₁, ..., xₙ and x₁, ..., x_{n'} can be assumed to start similarly,
by a change of bound variables.

(ii) Let α be a common node of A, A'. A, A' are mergeable at α iff
A_α, A'_α are top mergeable.

(iii) A, A' separate at depth k iff A =_k A' and A, A' are not mergeable
at some common node α with d(α) = k.


6.10. LEMMA. Let BT(M) =_k BT(N). If BT(M), BT(N) are mergeable at
all α with d(α) = k, then they can be merged up to k + 1.


PROOF. Let d(α) = k. BT(M)_α, BT(N)_α are, say, trees with tops

    λx₁ ⋯ xₙ.xᵢ  (with m successors)  and  λx₁ ⋯ x_{n'}.xᵢ  (with m' successors).

Now make an η-change as follows: replace these tops by

    λx₁ ⋯ x_{n+n'}.xᵢ   and   λx₁ ⋯ x_{n'+n}.xᵢ,

adding x_{n+1}, ..., x_{n+n'} as new last successors in the first tree and
x_{n'+1}, ..., x_{n'+n} as new last successors in the second.

Note that m + n' = m' + n by the mergeability at α, so after the change, α
has the same number of successors in both trees. After this change is made
for all α with d(α) = k, the resulting labelled trees are Böhm trees of
M', N', say, which merge BT(M), BT(N) up to k + 1. □


6.11. COROLLARY. If BT(M) ≁η BT(N), then ∃k, M', N' such that M', N'
merge BT(M), BT(N) up to k and BT(M'), BT(N') separate at depth k.


PROOF. Let k be maximal such that BT(M), BT(N) can be merged up to k,
by some M', N'. Then BT(M') =_k BT(N'). Suppose that BT(M'), BT(N')
are mergeable at all α with d(α) = k. Then by 6.10, BT(M'), BT(N'), and
hence BT(M), BT(N), can be merged up to k + 1, contradicting the
maximality of k. Therefore BT(M'), BT(N') separate at depth k. □


6.12. DEFINITION. (i) A transformation is a mapping f: Λ → Λ.

(ii) A solving transformation f is either defined by f(P) = Px for some x
or by f(P) = P[x/Nx] for some x and closed N.

(iii) A Böhm transformation is a finite composition of solving
transformations. Notations: π ranges over Böhm transformations; M^π = π(M).


6.13. DEFINITION. BT(M) is head original up to k iff BT(M) has a free
head variable which does not occur freely at any other node α with
d(α) ≤ k.


6.14. LEMMA. If BT(M), BT(N) are head original up to k and separate at
depth k > 0, then for some π, BT(M^π), BT(N^π) separate at depth k − 1.


Proor. Let the trees separate at node a: 


AX Xn Xi 


depth k 


Define m(P)= Px,:+-x,[%/Uj"x,], where Uj = Ayits'ym+¥- Then 
BT(M”), BT(N”) separate at depth k — 1: 


depth k — 1 { { 


The assumption of head originality is needed to insure that the difference 
A, A is not lost by the substitution [x,/U7'x,]. 


6.15. LEMMA. Let BT(M), BT(N) separate at depth k > 0. Then for some
π, BT(M^π), BT(N^π) are head original up to k and still separate at depth k.


PROOF. Let C_q ≡ λz₀ ⋯ z_{q+1}.z_{q+1}z₀ ⋯ z_q. For a node α in a Böhm tree let
#α be max(s, t) where λa_{i₁} ⋯ a_{i_s}.a_i is the label at α and t is the number
of successors of α; #α is 0 if Ω is the label at α. The assumption implies that
M, N have hnf's, say λx₁ ⋯ xₙ.xᵢM₁ ⋯ M_m and λx₁ ⋯ xₙ.xᵢN₁ ⋯ N_m
respectively. Now define

a (P) = (Pxi ++ Xn) [Xi CX Zmei * Zq41s 


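where $q > 2(\#\alpha)$ for all $\alpha$ in $BT(M)$, $BT(N)$ with $d(\alpha) \le k$ and
$z_{m+1}, \ldots, z_{q+1}$ are fresh variables. Note that the Böhm trees at depth $\le k$ of
$M^\pi$, $N^\pi$ result from those of $M$, $N$ respectively by replacing the tops
and the internal nodes with free head variable $x_i$ as follows:

[Figure (*): the top $\lambda x_1 \cdots x_n . x_i$ is replaced by a node with head variable $z_{q+1}$; each internal node $\lambda y_1 \cdots y_s . x_i$ with $t$ successors is replaced by a node $\lambda y_1 \cdots y_s w_{t+1} \cdots w_{q+1} . w_{q+1}$ whose successors are $x_i$, the original $t$ subtrees, and $w_{t+1}, \ldots, w_q$.]

(here $q > t$ is used). Clearly $BT(M^\pi)$ and $BT(N^\pi)$ are head original
up to $k$.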


Claim: These trees separate at depth $k$.

Since $BT(M) =_k BT(N)$, also $BT(M^\pi) =_k BT(N^\pi)$. By assumption
$BT(M)$, $BT(N)$ are not mergeable at some node $\alpha$ with $d(\alpha) = k$. Consider
the Böhm trees

[Figure: $BT(M)$, $BT(N)$, $BT(M^\pi)$, $BT(N^\pi)$, with labels $A$, $A'$, $A^\pi$, $A'^\pi$ respectively at the node $\alpha$ of depth $k$.]

Suppose $A$ and $A'$ both have the free head variable $x_i$. Referring to
(*), $A^\pi$ and $A'^\pi$ are top mergeable iff $s + (q+1-t) - (1+q) = s' + (q+1-t') - (1+q)$
iff $s - t = s' - t'$, which is not the case since $A$ and $A'$
are not top mergeable. Hence $BT(M^\pi)$, $BT(N^\pi)$ are not mergeable at $\alpha$,
i.e. separate at depth $k$. The same conclusion holds in the other cases: $x_i$ is
head variable of $A$, not of $A'$ (that $q$ is large compared with $s$, $s'$, $t$ ensures that
$A^\pi$ and $A'^\pi$ have different head variables and hence are not top mergeable); $x_i$ is not head
variable of either $A$ or $A'$. □


6.16. COROLLARY. If $BT(M)$, $BT(N)$ separate at depth $k > 0$, then for some
$\pi$, $BT(M^\pi)$, $BT(N^\pi)$ separate at depth $k-1$.

PROOF. By 6.14 and 6.15. □


6.17. LEMMA. If $BT(M)$, $BT(N)$ separate at level 0, then for some $\pi$, $M^\pi$ is
solvable, $N^\pi$ is unsolvable, or conversely.


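PROOF. If $M$ is unsolvable, $BT(M) = \Omega$ and hence $N$ is solvable since
$BT(M)$, $BT(N)$ differ at level 0. In this case take $\pi$ the identity. If $M$, $N$ are
solvable, say $M \to \lambda x_1 \cdots x_n . x_i M_1 \cdots M_m$, $N \to \lambda x_1 \cdots x_{n'} . x_{i'} N_1 \cdots N_{m'}$,
then by assumption $i \ne i'$ or $n - m \ne n' - m'$. In case $i \ne i'$ take
$P^\pi = P[x_i / (\lambda y_1 \cdots y_m . I) x_i][x_{i'} / \Omega x_{i'}]$. Then $M^\pi = I$ and
$N^\pi = \lambda x_1 \cdots x_{n'} . \Omega \cdots = \Omega$ (mod $\mathcal{H}$) and hence unsolvable. In case $i = i'$ and $n - m \ne n' - m'$, let
$P^{\pi_0} = P x_1 \cdots x_p \Omega$ with $p$ large enough ($\ge \max(m, n, m', n')$). Then
$M^{\pi_0} = x_i M_1 \cdots M_m x_{n+1} \cdots x_p \Omega$, $N^{\pi_0} = x_i N_1 \cdots N_{m'} x_{n'+1} \cdots x_p \Omega$. The length of the
sequences after $x_i$ is $m + p - n + 1$, $m' + p - n' + 1$ respectively, which
differ since $n - m \ne n' - m'$. Hence by defining $\pi = \pi_1 \circ \pi_0$ with
$P^{\pi_1} = P[x_i / U^q_j x_i]$ where $U^q_j = \lambda y_1 \cdots y_q . y_j$ is an appropriate selector, the
required $\pi$ is found. □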


PROOF OF THEOREM 6.8. Suppose $M = N \in \mathcal{H}^*$, but $BT(M) \not\sim_\eta BT(N)$. By
6.11 for some $M'$, $N'$, $\lambda\eta \vdash M = M'$, $N = N'$ and $BT(M')$, $BT(N')$ separate
at depth $k$. By iterated application of 6.16, $BT(M'^\pi)$, $BT(N'^\pi)$ separate at
depth 0 for some $\pi$. Hence it follows from 6.17 that for some $\pi_1$, $M'^{\pi_1}$ is,
say, solvable and $N'^{\pi_1}$ is unsolvable. Now for each $\pi$ there is a $C_\pi[\ ]$ such
that $\pi(P) = C_\pi[P]$ (a substitution $M[x/Nx]$ can be written as $C[M]$ with
$C[\ ] = (\lambda x.[\ ])(Nx)$). So $C_{\pi_1}[M']$ is solvable and $C_{\pi_1}[N']$ is unsolvable,
hence $\mathcal{H}^* \nvdash M' = N'$ by the proof of 5.6. But this contradicts $\mathcal{H}^* \vdash M = N$,
since $\mathcal{H}^*$ proves extensionality (see 5.9(ii)). □


7. Analysis of $D_\infty$


Using Böhm trees it is possible to give an elegant characterization of
equality of terms in $P\omega$ and $D_\infty$.


7.1. THEOREM. (i) $D_\infty \models M = N \iff \mathcal{H}^* \vdash M = N \iff BT(M) \sim_\eta BT(N)$.
(ii) $P\omega \models M = N \iff BT(M) = BT(N)$.


The first result is due to Hyland and Wadsworth, the second to Hyland. 
We will only present the proof of (i) (see 7.16). See HYLAND [1976] or
BARENDREGT [to appear] for the other proof.


7.2. COROLLARY (see 6.6 and 6.7 for a definition of the terms
involved). The equalities Ax = Ay, Y, = Y, J = I hold in $D_\infty$ but cannot be
proved algebraically, i.e. by conversion.


PROOF. The equality in $D_\infty$ follows by 6.6, 6.7 and the theorem. By the
Church-Rosser theorem the equalities are not provable in $\lambda$. □


7.3. DEFINITION. (In the sequel $[\ ]$ is $[\ ]^{D_\infty}$.) The $\lambda\Omega$-calculus was introduced
by Wadsworth as a tool for examining $D_\infty$. $\lambda\Omega$-terms are defined by
adding to the formation rules of terms (see 1.7): $\Omega$ is a term. The
interpretation in $D_\infty$ is extended to $\lambda\Omega$-terms by setting $[\Omega]_\rho = \bot$.
Reduction for $\lambda\Omega$-terms is ordinary reduction extended with the axioms
$\lambda x.\Omega \to \Omega$ and $\Omega P \to \Omega$. A $\lambda\Omega$-term $P$ has an $\Omega$ nf $Q$ iff $P \to Q$ is an $\Omega$
reduction and $Q$ has no subterm $(\lambda x.R)S$, $\lambda x.\Omega$ or $\Omega R$. Note that $\Omega$
reduction preserves the value in $D_\infty$, since $\bot d = \bot$, hence also $\lambda x.\bot = \bot$. If
$P$ has a nf in the original sense, then $P$ has an $\Omega$ nf, since replacements
$\lambda x.\Omega \to \Omega$ or $\Omega R \to \Omega$ decrease the length of a term. Böhm trees of
$\lambda\Omega$-terms are defined by letting $BT(\Omega) = \Omega$.
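
The two additional axioms can be illustrated by a small program. The following Python sketch is ours and is only an illustration: the tuple encoding of terms and the names `OMEGA`, `omega_reduce` and `size` are assumptions made for the example, and the function applies only the two Omega-axioms (no beta-steps).

```python
# Hypothetical term encoding (not from the text):
#   ("var", name) | ("lam", name, body) | ("app", fun, arg) | ("omega",)
OMEGA = ("omega",)

def omega_reduce(term):
    """Exhaustively apply  lam x. Omega -> Omega  and  Omega P -> Omega."""
    tag = term[0]
    if tag == "lam":
        body = omega_reduce(term[2])
        # lam x. Omega -> Omega
        return OMEGA if body == OMEGA else ("lam", term[1], body)
    if tag == "app":
        fun = omega_reduce(term[1])
        if fun == OMEGA:
            return OMEGA  # Omega P -> Omega
        return ("app", fun, omega_reduce(term[2]))
    return term  # variables and Omega itself are already reduced

def size(term):
    """Number of nodes of a term; each Omega-contraction makes it smaller."""
    return 1 + sum(size(t) for t in term[1:] if isinstance(t, tuple))

# lam x. (Omega y) collapses to Omega, while lam x. x Omega is left unchanged.
collapsing = ("lam", "x", ("app", OMEGA, ("var", "y")))
stable = ("lam", "x", ("app", ("var", "x"), OMEGA))
```

Here `omega_reduce(collapsing)` yields `OMEGA` and `omega_reduce(stable)` returns `stable` itself; the drop in `size` reflects the remark that these replacements decrease the length of a term.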


7.4. DEFINITION. Approximation.

(i) Let $P$, $Q$ be $\lambda\Omega$-terms. $P$ approximates $Q$, notation $P \sqsubseteq Q$, iff $BT(P)$
results from $BT(Q)$ by replacing some subtrees by the tree $\Omega$ (e.g.
$\lambda x.x\Omega \sqsubseteq \lambda x.IxK$).

(ii) Let $M$ be a $\lambda$-term. $P$ is an approximate normal form (anf) of $M$ iff
$P \sqsubseteq M$ and $P$ is an $\Omega$ nf.

(iii) $\mathcal{A}(M) = \{P \mid P$ is an anf of $M\}$.

(iv) $M^k$ is the anf of $M$ such that $BT(M^k)$ results from $BT(M)$ by
replacing all labels at nodes of depth $k$ by $\Omega$ and cancelling the deeper
nodes. Note that $M^k \in \mathcal{A}(M)$.


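7.5. EXAMPLE. $\lambda f.f(\Omega) \sqsubseteq \lambda f.f((\lambda x.f(xx))(\lambda x.f(xx)))$, hence $\lambda f.f(\Omega)$ is
an anf of the fixed point combinator $Y$ (defined after 2.1). In fact

$Y^k = \lambda f.f^k(\Omega)$,  $\mathcal{A}(Y) = \{\Omega\ (\leftarrow \lambda f.\Omega),\ \lambda f.f\Omega,\ \lambda f.f(f\Omega),\ \ldots\}$.
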
7.6. LEMMA. $P \sqsubseteq Q \Rightarrow [P] \sqsubseteq [Q]$.

PROOF. $P$ is $Q$ with some subterms replaced by $\Omega$. Now $[\Omega] = \bot$, the least
element of $D_\infty$. The result follows since application and abstraction in $D_\infty$
are monotonic (by their continuity). □


The following theorem is quite useful for determining the value of
$\lambda$-terms in $D_\infty$. The proof will be given in 7.18-7.24.


7.7. APPROXIMATION THEOREM (conjectured by Scott, proved by Hyland,
and improved by Wadsworth). For $\lambda$-terms $M$:

$[M] = \bigsqcup \{[P] \mid P \in \mathcal{A}(M)\}$.

The same theorem holds in $P\omega$ with the $\bigsqcup$ replaced by $\bigcup$.


7.8. COROLLARY. $[M] = \bigsqcup_k [M^k]$.


PROOF. Let $P \in \mathcal{A}(M)$, then $P \sqsubseteq M$, $P$ $\Omega$-nf. Now let all nodes in $BT(P)$
have depth $< k$. Then $[P] \sqsubseteq [M^k]$ and the result follows. □


7.9. THEOREM. The following are equivalent for $M \in \Lambda$:
(i) $M$ is solvable,
(ii) $[M] \ne \bot$,
(iii) $M$ has a hnf.


PROOF. It may be assumed that $M$ is closed (if not, consider $\lambda \vec{x}.M$ and
note $\lambda d.\bot = \bot$).

(i) $\Rightarrow$ (ii) Let $M$ be solvable. Then for some $\vec{N}$, $\lambda \vec{}\vdash M\vec{N} = I$. If $[M] = \bot$,
then $[I] = [M\vec{N}] = \bot[\vec{N}] = \bot$, contradicting $Id = d$. So $[M] \ne \bot$.

(ii) $\Rightarrow$ (iii) Suppose $M$ has no hnf. Then $\mathcal{A}(M) = \{\Omega\}$, hence by 7.7,
$[M] = \bot$.

(iii) $\Rightarrow$ (i) Suppose $M \to \lambda \vec{x}.x_i M_1 \cdots M_m$. Then by giving $M$ enough
arguments $N \equiv \lambda a_1 \cdots a_m . I$, $M$ can be solved. □


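7.10. COROLLARY. $D_\infty$ is sensible; hence $\lambda\eta + \mathcal{H}$ is consistent.


PROOF. By 7.9, $M$ is unsolvable $\Rightarrow [M] = \bot$; so $D_\infty \models \mathcal{H}$. Hence by 4.23,
$D_\infty \models \lambda\eta + \mathcal{H}$. □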


Another consequence of the approximation theorem is a connection
between the fixed point combinator and the least fixed point operator for
complete lattices.


7.11. THEOREM (Tarski). Let $D$ be a complete lattice. Each continuous
$f : D \to D$ has a fixed point. Moreover there is a $Y^* \in [[D \to D] \to D]$ such
that $Y^* f$ is the minimal fixed point of $f$.


PROOF. Let $Y^* f = \bigsqcup \{f^n(\bot) \mid n \in \omega\}$. Then $Y^*$ is continuous by 4.8 and
4.5. Since $\bot \sqsubseteq f(\bot)$, so $f^n(\bot) \sqsubseteq f^{n+1}(\bot)$, $\{f^n(\bot) \mid n \in \omega\}$ is directed, hence
$Y^* f$ is a fixed point of $f$. If $f(x) = x$, then since $\bot \sqsubseteq x$, $f(\bot) \sqsubseteq f(x) = x$, etc.,
so $Y^* f \sqsubseteq x$. □
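
On a finite lattice the supremum in this proof is reached after finitely many steps, so the construction can be run directly. The following Python sketch is only an illustration under our own assumptions: the lattice is the set of subsets of {0,...,5} ordered by inclusion, and the graph `EDGES` and the map `step` are hypothetical examples, not taken from the text.

```python
def least_fixed_point(f, bottom):
    """Follow the chain bottom, f(bottom), f(f(bottom)), ... until it stabilises."""
    x = bottom
    while True:
        nxt = f(x)
        if nxt == x:
            return x
        x = nxt

# Hypothetical example: a small directed graph on the nodes 0..5.
EDGES = {0: [1, 2], 1: [3], 2: [3], 3: [], 4: [5], 5: [4]}

def step(s):
    """Monotone map on subsets: add node 0 and every successor of a node in s."""
    return frozenset({0}) | s | frozenset(m for n in s for m in EDGES[n])

# The least fixed point of `step` is the set of nodes reachable from 0.
reachable = least_fixed_point(step, frozenset())
```

The full set {0,...,5} is also a fixed point of `step`, but it is not the least one; the iteration from the bottom element finds {0, 1, 2, 3}, in line with the minimality argument at the end of the proof.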


Let $Y_{Tarski}$ be the element of $D_\infty$ corresponding to the fixed point
operator $Y^*$ and let $Y_{Curry} = [Y]$, with $Y$ defined after 2.1.


7.12. THEOREM (Park). In Scott's model $D_\infty$, $Y_{Tarski} = Y_{Curry}$.


PROOF. By 7.8 and 7.5,

$Y_{Curry}\, d = (\bigsqcup_k [\lambda f.f^k(\Omega)])\, d = \bigsqcup_k d^k(\bot) = Y_{Tarski}\, d$

by definition. The result now follows by extensionality. □


Results analogous to 7.11 and 7.12 hold for Pw. 
In order to prove the characterization of equality in $D_\infty$, the following
definition is needed.


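7.13. DEFINITION. (i) $M \sqsubseteq_k N$ iff $M^k \sqsubseteq N^k$;

(ii) $M <_k N$ iff $\exists M', N'$ [$\lambda\eta \vdash M = M'$, $\lambda\eta \vdash N = N'$ and $M' \sqsubseteq_k N'$];

(iii) $M < N$ iff $\forall k\ M <_k N$.

Note that if $BT(M) \sim_\eta BT(N)$, then $M < N$ and $N < M$. The converse
follows from 7.14 and 7.16. Also note $M^k < M$. The relation $<$ is transitive, since if
$M_1^k \sqsubseteq N_1^k$ and $N_2^k \sqsubseteq L^k$, where $\lambda\eta \vdash N_1 = N_2$, then by some $\beta\eta$-conversion
$M_1$ becomes $M_1'$ such that $M_1'^k \sqsubseteq N_2^k \sqsubseteq L^k$.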

7.14. THEOREM. $M < N \Rightarrow [M] \sqsubseteq [N]$.

The proof is given in 7.25-7.27.

7.15. COROLLARY. $BT(M) \sim_\eta BT(N) \Rightarrow [M] = [N]$.

PROOF. By the remark following 7.13. □


7.16. THEOREM (Hyland; Wadsworth). The following are equivalent for
$M, N \in \Lambda$.
(i) $BT(M) \sim_\eta BT(N)$,
(ii) $D_\infty \models M = N$,
(iii) $\mathcal{H}^* \vdash M = N$.


PROOF. (i) $\Rightarrow$ (ii) is 7.15.
(ii) $\Rightarrow$ (iii) by 5.9 since $D_\infty \models \mathcal{H}$ (7.10).
(iii) $\Rightarrow$ (i) is 6.8. □


Note that $Th(D_\infty) \subseteq \mathcal{H}^* \Rightarrow Th(D_\infty) = \mathcal{H}^*$ is not obvious (as would have
been the case in ordinary model theory): since the language of the
$\lambda$-calculus does not contain logical operators, $Th(\mathfrak{M})$ is not necessarily a
complete set of sentences.


7.17. COROLLARY. (i) $Th(D_\infty) = \mathcal{H}^*$; hence for every $D$, $D_\infty$ satisfies the
same set of equations,
(ii) $D_\infty^0$ is algebraically simple.


PROOF. (i) Immediate.
(ii) $D_\infty^0 = \mathfrak{M}^0(Th(D_\infty)) = \mathfrak{M}^0(\mathcal{H}^*)$, hence the result follows from 5.11. □


In order to prove the approximation theorem the following indexed
$\lambda\Omega$-calculus was introduced by Hyland and Wadsworth.


7.18. DEFINITION. The set of indexed ($\lambda\Omega$)-terms is defined by adding to
the formation rules of the $\lambda\Omega$-terms: if $M$ is an indexed $\lambda\Omega$-term, so is
$(M)^p$ for all $p \in \omega$. The interpretation of $\lambda\Omega$-terms in $D_\infty$ is extended to
indexed terms by adding the clause $[(M)^p]_\rho = ([M]_\rho)_p$. Thus the
superscripts $p$ are considered as the projections $\Phi_{\infty p} : D_\infty \to D_p$.

If $M$ is an indexed term, $M^*$ is obtained from $M$ by leaving out all
superscripts. Note that $[M] \sqsubseteq [M^*]$. A completely indexed term $M$ is such
that each subterm occurrence $N$ of $M^*$ has an index in $M$ (i.e. occurs as
part of $(N)^p$ in $M$).


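7.19. DEFINITION. The reduction relation on $\lambda\Omega$-terms is extended to
indexed $\lambda\Omega$-terms by adding the axioms

$\lambda x.\Omega^p \to \Omega^p$;  $\Omega^p M \to \Omega^p$;  $(\lambda x.M)^{p+1} N \to (M[x/N^p])^p$;
$(\lambda x.M)^0 N \to (M[x/\Omega^0])^0$;  $(M^p)^q \to M^{\min(p,q)}$

and the rule $M \to N \Rightarrow M^p \to N^p$. An indexed term $M$ has a nf if for some
$N$, $M \to N$ and $N^*$ is in $\Omega$ nf.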


7.20. LEMMA. Let $M$, $N$ be indexed terms, and $M \to N$ as indexed terms.
Then

(i) $N^* \sqsubseteq M^*$.

(ii) $[M] = [N]$.


PROOF. (i) The approximation comes in at reductions like $(\lambda x.x)^0 N \to \Omega^0$.
(ii) By 4.19 and 4.15(i), (v). □


7.21. DEFINITION. (i) An indexing $I$ on $M$ is a mapping that assigns to each
subterm occurrence of $M$ an index. $M^I$ is the resulting completely indexed
term.
(ii) $\mathcal{I}(M) = \{M^I \mid I$ indexing on $M\}$.


7.22. LEMMA. Let $M$ be a $\lambda$-term. Then $[M] = \bigsqcup \{[N] \mid N \in \mathcal{I}(M)\}$.

PROOF. By induction on the structure of $M$, using 4.15(iii). □


7.23. LEMMA. Each completely indexed term has a nf.


PROOF. $M$ has a $p$-redex iff $(\lambda x.P)^p Q$ occurs in $M$. The order of $M$ is the
maximal $p$ such that $M$ has a $p$-redex. By induction on the order $p$ of $M$, it
is shown that $M$ has a nf.

$p = 0$: contractions like $(\lambda x.P)^0 Q \to (P[x/\Omega^0])^0$, $\lambda x.\Omega^p \to \Omega^p$,
$\Omega^p M \to \Omega^p$ and $(M^p)^q \to M^{\min(p,q)}$ decrease the length of a term, hence
each term of order 0 has a nf.

$p = n+1$: replacing the rightmost $n+1$ redex $(\lambda x.P)^{n+1} Q$ by
$(P[x/Q^n])^n$ and then replacing terms $(Q^n)^q$ by $Q^{\min(n,q)}$ results in a term
with one less occurrence of a $p$-redex. (Prime example (some indices are
omitted):

$(\lambda ab.baa)^{n+1}((\lambda x.x^{n+1}R)^{n+1}(\lambda z.z)^{n+2}) \to (\lambda ab.baa)^{n+1}(((\lambda z.z)^n)^{n+1}R) \to (\lambda ab.baa)^{n+1}((\lambda z.z)^n R)$.)

After a finite number of steps the term is reduced to one with order $n$
and the induction hypothesis applies. □


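7.24. PROOF OF 7.7. $[M] = \bigsqcup \{[N] \mid N \in \mathcal{I}(M)\} = \bigsqcup \{[L] \mid \exists N \in \mathcal{I}(M),\ L$ nf
of $N\} \sqsubseteq \bigsqcup \{[L^*] \mid \exists N \in \mathcal{I}(M),\ L$ nf of $N\} \sqsubseteq \bigsqcup \{[N] \mid N \in \mathcal{A}(M)\} \sqsubseteq [M]$.

The five (in)equalities follow respectively from 7.22, 7.20(ii) and 7.23,
$[L] \sqsubseteq [L^*]$, $L^* \in \mathcal{A}(M)$ since by 7.20(i) $L^* \sqsubseteq N^* = M$, and from
$N \in \mathcal{A}(M) \Rightarrow [N] \sqsubseteq [M]$ (by 7.6). □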


Now Hyland's proof of 7.14 will be presented.


7.25. LEMMA. Let $P \ne \Omega$ be an $\Omega$ nf and $P < Q$. Then $BT(P)$, $BT(Q)$
are top mergeable, $\lambda\eta \vdash P = \lambda x_1 \cdots x_n . x_i P_1 \cdots P_m$ and
$\lambda\eta \vdash Q = \lambda x_1 \cdots x_n . x_i Q_1 \cdots Q_m$ say, and $P_1 < Q_1, \ldots, P_m < Q_m$; the $P_1, \ldots, P_m$ are in
$\Omega$ nf.


PROOF. Only $\eta$-reductions affect the Böhm trees. Therefore, since $P < Q$,
after some $\eta$-changes the tops of $BT(P)$, $BT(Q)$ are the same, i.e. they are
top mergeable. The $P_i$'s are either already part of $P$ or a variable created
by an $\eta$-expansion; therefore they are in $\Omega$ nf. Since $P <_{k+1} Q$, it follows
that $P_i <_k Q_i$ for all $k$; hence $P_i < Q_i$. □


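7.26. LEMMA. Let $P$ be an $\Omega$ nf. Then $P < N \Rightarrow [P] \sqsubseteq [N]$.


PROOF. By induction on the structure of $P$.

Case 1. $P = \Omega$. Then we are done.

Case 2. $P = x$. Then, using $x <_1 N$, $\lambda \vdash N = \lambda y_1 \cdots y_n . x N_1 \cdots N_n$. By
7.22 it is sufficient to show that for any indexing $I$ of $x$, $[x^I] \sqsubseteq [N]$. This is
done by induction on $k = I(x)$.

If $k = 0$, then $[(x)^0] = [\lambda y_1 \cdots y_n . (x)^0 \Omega \cdots \Omega] \sqsubseteq [N]$ since in $D_\infty$
$x_0 = x_0 \bot = \lambda y.x_0$ by 4.19(ii).

If $I(x) = k+1$, then $[(x)^{k+1}] = [\lambda y_1 \cdots y_n . (x)^{k+1} (y_1)^k \cdots (y_n)^k]$ by
4.19(i) (and 4.14(i)). Since $x < N$, for all $i$, $1 \le i \le n$, $y_i < N_i$, hence by the
induction hypothesis $[(y_i)^k] \sqsubseteq [N_i]$. So we have $[(x)^{k+1}] \sqsubseteq [N]$.

Case 3. $P = \lambda x_1 \cdots x_m . x P_1 \cdots P_n$, with $P_1, \ldots, P_n$ $\Omega$ nf's. Then since $P < N$,
for some $P'$, $N'$ with $\lambda\eta \vdash P' = P$, $\lambda\eta \vdash N' = N$ one has
$P' = \lambda x_1 \cdots x_p . x P_1' \cdots P_q'$, $N' = \lambda x_1 \cdots x_p . x N_1' \cdots N_q'$ and $P_1' < N_1', \ldots, P_q' < N_q'$.

By the induction hypothesis $[P_1'] \sqsubseteq [N_1'], \ldots$; hence
$[P] = [P'] \sqsubseteq [N'] = [N]$. □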


7.27. PROOF OF THEOREM 7.14. Suppose $M < N$. Then $\forall k\ M^k < N$, hence
by 7.26, $\forall k\ [M^k] \sqsubseteq [N]$. Therefore $[M] = \bigsqcup_k [M^k] \sqsubseteq [N]$. □


1. An extension of the Finite Ramsey Theorem


In this chapter we present a recent discovery in mathematical logic. We 
investigate a reasonably natural theorem of finitary combinatorics, a simple 
extension of the Finite Ramsey Theorem. This chapter is mainly devoted to 
demonstrating that this theorem, while true, is not provable in Peano 
arithmetic. 

The first examples of strictly mathematical statements about natural 
numbers which are true but not provable in PA (Peano arithmetic) were 
due to the first author (see Paris [to appear]) and grew out of the work in 
Paris and Kirby [to appear]. The second author’s contribution was to show
that Paris’s proof could be carried through with the particularly simple 
extension of the Finite Ramsey Theorem mentioned above (and stated 
precisely in 1.2). 

Since we are going to work extensively with the partition calculus, the 
reader would be wise to consult pages 390-393 of Chapter B.3 for basic 
information and pages 393-395 for a proof of the Infinite Ramsey 
Theorem. 


1.1. DEFINITION. We call a finite set $H$ of natural numbers relatively large
if $card(H) \ge \min(H)$. Given natural numbers $e$, $r$, $k$ and $M$, we use the
notation

$M \underset{*}{\longrightarrow} (k)^e_r$

to mean that for every partition $P : [M]^e \to r$ there is a relatively large
$H \subseteq M$ which is homogeneous for $P$ and of cardinality at least $k$.


1.2. THEOREM. For all natural numbers $e$, $r$ and $k$ there is an $M$ such that
$M \underset{*}{\longrightarrow} (k)^e_r$.
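
For very small parameters the relation in 1.2 can be checked by exhaustive search. The following Python sketch is ours and is only an illustration: the function names are hypothetical, $M$ is read as the set {0, ..., M-1}, a relatively large set is taken to satisfy card(H) >= min(H), and $k \ge 1$ is assumed. The search is exponential and is usable only for tiny values.

```python
from itertools import combinations, product

def is_homogeneous(H, colouring, e):
    """True if all e-element subsets of H receive the same colour."""
    return len({colouring[s] for s in combinations(H, e)}) <= 1

def arrows_star(M, k, e, r):
    """Brute-force test of the starred arrow relation for {0, ..., M-1}; assumes k >= 1."""
    tuples = list(combinations(range(M), e))
    candidates = [H for size in range(k, M + 1)
                  for H in combinations(range(M), size)
                  if len(H) >= min(H)]              # relatively large sets only
    for colours in product(range(r), repeat=len(tuples)):
        colouring = dict(zip(tuples, colours))
        if not any(is_homogeneous(H, colouring, e) for H in candidates):
            return False                            # this partition is a counterexample
    return True

def least_M(k, e, r):
    """Smallest M for which the relation holds (terminates by Theorem 1.2)."""
    M = 0
    while not arrows_star(M, k, e, r):
        M += 1
    return M
```

For instance, with $e = 1$, $r = 2$, $k = 3$ the search stops at $M = 5$: colouring 0, 1 with one colour and 2, 3 with the other is a counterexample for $M = 4$.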


Without the * under the arrow which makes the homogeneous sets 
relatively large, this would just be the Finite Ramsey Theorem. The Finite 
Ramsey Theorem is provable in Peano Arithmetic. Our proof of 1.2 will 
use the Infinite Ramsey Theorem, and cannot be carried out in PA. 


1.3. Main THEOREM. The combinatorial principle of 1.2 is not provable in 
Peano Arithmetic. 


For the reader who is not used to working in PA, and so does not even 
see how to formulate 1.2 in PA, we would remark that PA is equivalent 
(for statements about natural numbers) to the result of replacing the axiom
of infinity by its negation in the usual axioms ZF of set theory (see Chapter 
B.1), and 1.2 can be formulated in this theory directly, without any coding. 


2. Proofs of 1.2 and 1.3 


We first prove 1.2. Fix $e$, $r$ and $k$, and suppose there were no $M$ of the
desired kind. Call $P$ a counterexample for $M$ if $P$ is a partition of $[M]^e$ into
$r$ pieces with no relatively large homogeneous set of size at least $k$. We may
view the set of counterexamples as a finitely branching infinite tree. That is,
if $P$ and $P'$ are counterexamples for $M$ and $M'$ respectively, we put $P$
below $P'$ in our tree just in case $M < M'$ and $P$ is the restriction of $P'$ to
$[M]^e$. By König’s Lemma there is a $P : [\omega]^e \to r$ such that for every $M$, the
restriction of $P$ to $[M]^e$ is a counterexample for $M$. By the Infinite Ramsey
Theorem, there is an infinite $H \subseteq \omega$ homogeneous for $P$. But then by
choosing $M$ large enough (compared to $k$ and $\min(H)$) we see that $H \cap M$
is, after all, a relatively large homogeneous set for $P \restriction [M]^e$ of size at least
$k$. □


Looking ahead to Section 3, we point out that, for each $e$, the above
proof can be formalized in PA. (The proof on pages 393-395 of $\omega \to (\omega)^e_r$ is
naturally formalized, by induction on $e$, in restricted-($\Pi^0_\infty$-CA), which is
conservative over PA (see page 940).) Thus, for every $e$,

$PA \vdash \forall r, k\ \exists M\ (M \underset{*}{\longrightarrow} (k)^e_r)$.


We now begin the proof of 1.3, which will take up the remainder of this
section. We define a certain theory $T$ in 2.1 and then show that
$Con(T) \to Con(PA)$ is provable in PA. The proof will be concluded by
showing, in PA, that the combinatorial principle of 1.2 implies $Con(T)$.

For the purpose of the following we identify finite subsets of $\omega$ with finite
increasing sequences from $\omega$. The theory $T$ is expressed in the language of
PA plus infinitely many new constant symbols $c_0, c_1, \ldots$.


2.1. DEFINITION. The axioms of $T$ are as follows:
(i) The usual recursive defining equations for $+$, $\times$, $<$, plus the
induction axioms but only for limited formulas.
(ii) For each $i = 0, 1, \ldots$, the axiom $(c_i)^2 < c_{i+1}$.
(iii) For each finite subset $i = i_1, \ldots, i_n$ of $\omega$, let $c(i) = c_{i_1}, \ldots, c_{i_n}$.
For each $i < k, k'$ and each limited formula $\psi(y; z)$ (where $k$, $k'$ and $z$ all
have the same length) we have the axiom:

$\forall y < c_i\ [\psi(y; c(k)) \leftrightarrow \psi(y; c(k'))]$.

2.2. PROPOSITION. $Con(T)$ implies $Con(PA)$.


PROOF. Let $\mathfrak{A} \models T$ and let $I$ be the initial segment of $\mathfrak{A}$ of those $a < c_i$ for
some $i \in \omega$. By (ii), $I$ is closed under $+$ and $\times$. It will be enough to show
the following.


2.3. CLAIM. $\mathfrak{I} = (I, +, \times, <)$ is a model of PA.


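For each formula $\theta(y)$ from the language of PA, define a limited formula
$\theta^*(y; z)$ as follows. Write $\theta$ in prenex normal form, say $\exists x_1 \cdots \forall x_n\, \varphi(x; y)$
where $\varphi$ is quantifier free. Then $\theta^*(y; z_1, \ldots, z_n)$ is
$\exists x_1 < z_1 \cdots \forall x_n < z_n\, \varphi(x; y)$.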


2.4. CLAIM. Given $i < k$, $a < c_i$, and $\theta(y)$, where $k$, $a$ and $y$ are all of the
appropriate length,

$\mathfrak{I} \models \theta(a)$ if and only if $\mathfrak{A} \models \theta^*(a; c(k))$.


Notice that 2.3 is an immediate consequence of 2.4 since part (i) of 2.1
guarantees that for all $\theta$, $\mathfrak{A}$ will satisfy induction for $\theta^*$. The proof of 2.4
proceeds by induction on $\theta$. Suppose $\theta(y)$ is $\exists x\, \varphi(x, y)$. Thus $\theta^*(y; z)$ is
$\exists x < z_1\, \varphi^*(x, y; z_2, \ldots, z_n)$. So $\mathfrak{I} \models \theta(a)$ iff for some $b$ in $I$ and some $j$
(where $\min(j)$ is large) $\mathfrak{A} \models \varphi^*(b, a; c(j))$, which happens iff for some $k'$
(again, where $\min(k')$ is large) $\mathfrak{A} \models \theta^*(a; c(k'))$ which, by 2.1(iii), is the case
iff $\mathfrak{A} \models \theta^*(a; c(k))$. □


The attentive reader should observe that the proof of 2.2 can be
formalized in PA (in a way similar to Section 6 of Chapter D.1). Also, one
should notice that for the purposes of the above proof, we can weaken
2.1(iii) to those limited formulas $\psi(y; z)$ of the form $\theta^*(y; z)$ for some
$\theta(y)$.

2.5. PROPOSITION. The combinatorial principle of 1.2 implies $Con(T)$.

By Gödel’s Second Incompleteness Theorem (see page 825), 2.5 and 2.2
yield our main theorem, provided of course that 2.5, like 2.2, is proved in
PA.

Before beginning the proof of 2.5, we point out a few facts about
partitions.


2.6. LEMMA. Given partitions $P_0$ and $P_1$ of $[M]^e$ into $r_0$ and $r_1$ pieces, there is
a partition $P$ of $[M]^e$ into $r_0 r_1$ pieces such that for $H \subseteq M$, $H$ is homogeneous
for $P$ iff $H$ is homogeneous for both $P_0$ and $P_1$.


PROOF. Let $P(a) = (P_0(a), P_1(a))$. □
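
The product construction is easy to test on a small example. The following Python sketch is ours: the partitions `P0` and `P1` are arbitrary hypothetical choices on pairs from {0, ..., 5}, and the helper names are assumptions made for the illustration.

```python
from itertools import combinations

def product_partition(P0, P1):
    """P(a) = (P0(a), P1(a)), as in the proof of 2.6."""
    return lambda a: (P0(a), P1(a))

def homogeneous(H, P, e):
    """True if P is constant on the e-element subsets of H."""
    return len({P(a) for a in combinations(sorted(H), e)}) <= 1

M, e = 6, 2
P0 = lambda a: (a[0] + a[1]) % 2        # hypothetical partition into 2 pieces
P1 = lambda a: (a[1] - a[0]) % 3        # hypothetical partition into 3 pieces
P = product_partition(P0, P1)

# Compare homogeneity for P with joint homogeneity for P0 and P1 on every subset.
subsets = [H for n in range(M + 1) for H in combinations(range(M), n)]
agree = all(
    homogeneous(H, P, e) == (homogeneous(H, P0, e) and homogeneous(H, P1, e))
    for H in subsets
)
pieces_used = len({P(a) for a in combinations(range(M), e)})
```

Here `agree` comes out true and `pieces_used` is at most $r_0 r_1 = 6$, as the lemma predicts.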


2.7. LEMMA. A set $H \subseteq M$ is homogeneous for a partition $P$ of $[M]^e$ iff every
subset of $H$ of size $e+1$ is homogeneous for $P$.


PROOF. Let $a = a_1, \ldots, a_e$ be the first $e$ elements of $H$. Pick $b = b_1, \ldots, b_e$
so that $P(a) \ne P(b)$ and so that $b_1 + \cdots + b_e$ is minimized. If $i$ is the least
index such that $a_i \ne b_i$, then $\{a_1, \ldots, a_i, b_i, \ldots, b_e\}$ is not homogeneous and
of size $e+1$. □


We define $\sqrt{r}$ to be the first natural number $s$ such that $s^2 \ge r$. Notice
that for most $r$ (i.e., for $r \ge 7$), $r \ge 1 + 2\sqrt{r}$.


2.8. LEMMA. Given $P : [M]^e \to r$ there is a $P' : [M]^{e+1} \to (1 + 2\sqrt{r})$ such that
for all $H \subseteq M$ of cardinality $> e+1$, $H$ is homogeneous for $P$ iff $H$ is
homogeneous for $P'$.


PROOF. Let $s = \sqrt{r}$. Define functions $Q$ (for quotient) and $R$ (for remainder)
both mapping $[M]^e$ into $s$ by the equation $P(a) = s \cdot Q(a) + R(a)$.
For $b = b_1, \ldots, b_e, b_{e+1}$ in $[M]^{e+1}$, let $b' = b_1, \ldots, b_e$. We now define our
desired $P'$ on $[M]^{e+1}$ by:

$P'(b) = \begin{cases} 0 & \text{if } b \text{ is homogeneous for } P, \\ (0, R(b')) & \text{if } b \text{ is homogeneous for } Q \text{ but not for } P, \\ (1, Q(b')) & \text{otherwise.} \end{cases}$

Let $H$ be homogeneous for $P'$ of cardinality $> e+1$, and let $c$ be the first
$e+1$ members of $H$. We need to see that $P'(c) = 0$ to verify that $H$ is
homogeneous for $P$, by 2.7. Note that for each $a$ in $[c]^e$ there is a $b$ in
$[H]^{e+1}$ such that $b' = a$. Suppose $P'(c) = (1, i)$. Then, by the previous
remark, $Q(a) = i$ for all $a$ in $[c]^e$ so that $c$ is homogeneous for $Q$,
contradicting the definition of $P'$. So suppose $P'(c) = (0, j)$ so that $c$ is $Q$
homogeneous, say $Q(a) = i$ for all $a$ in $[c]^e$. But then $P(a) = s \cdot i + j$ for all
such $a$ so that $c$ is homogeneous for $P$, again contradicting the definition of
$P'$. □
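
The quotient and remainder construction can be checked exhaustively for tiny parameters. The following Python sketch is ours and is only an illustration: a partition $P$ is represented as a dictionary on sorted $e$-tuples, and the names `ceil_sqrt`, `refine` and `lemma_holds` are assumptions made for the example.

```python
from itertools import combinations, product

def ceil_sqrt(r):
    """First natural number s with s*s >= r (the square-root convention of the text)."""
    s = 0
    while s * s < r:
        s += 1
    return s

def refine(P, e, r):
    """Build P' on (e+1)-tuples from P (a dict on e-tuples), following the proof of 2.8."""
    s = ceil_sqrt(r)
    def P_prime(b):
        subs = list(combinations(b, e))         # all e-element subsets of b
        head = b[:e]                            # b' = b_1, ..., b_e
        if len({P[a] for a in subs}) == 1:
            return 0                            # b is homogeneous for P
        if len({P[a] // s for a in subs}) == 1:
            return (0, P[head] % s)             # homogeneous for Q only: record R(b')
        return (1, P[head] // s)                # otherwise: record Q(b')
    return P_prime

def homogeneous(H, colour, size):
    """True if `colour` is constant on the subsets of H of the given size."""
    return len({colour(t) for t in combinations(H, size)}) == 1

def lemma_holds(M, e, r):
    """Compare homogeneity for P and P' on every H with more than e+1 elements."""
    tuples = list(combinations(range(M), e))
    big_sets = [H for n in range(e + 2, M + 1) for H in combinations(range(M), n)]
    for values in product(range(r), repeat=len(tuples)):
        P = dict(zip(tuples, values))
        P_prime = refine(P, e, r)
        for H in big_sets:
            if homogeneous(H, P.__getitem__, e) != homogeneous(H, P_prime, e + 1):
                return False
    return True
```

With $M = 4$, $e = 1$ and $r = 9$ (so $s = 3$ and $P'$ takes at most $1 + 2s = 7$ values) the check runs over all $9^4$ partitions.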


2.9. LEMMA. Suppose we are given $n$ partitions $P_i : [M]^{e_i} \to r_i$, all $i < n$. Let
$e = \max_i e_i$ and $r = \prod_i \max(r_i, 7)$. There is a partition $P : [M]^e \to r$ such that
for all $H \subseteq M$ of cardinality $> e$, $H$ is homogeneous for $P$ iff $H$ is
homogeneous for all the $P_i$.


PROOF. Combine 2.6, 2.8 and the remark preceding 2.8. □


We now state a combinatorial principle which is tailored to imply 
Con(T). Parts (ii) and (iii) of 2.10 correspond to the similar parts of 2.1. 
There is no 2.10(i). After showing that 2.10 implies Con(T), we will return 
to derive 2.10 from 1.2. 


2.10. PROPOSITION. For all $e$, $k$, $r$ there is an $M$ such that for any family
$(P_\xi ; \xi < 2^M)$ of partitions $P_\xi : [M]^e \to r$, there is an $X$ of cardinality $\ge k$ such
that:

(ii) if $a, b \in X$ and $a < b$, then $a^2 < b$,

(iii) if $a \in X$ and $\xi < 2^a$, then $X \sim (a+1)$ is homogeneous for $P_\xi$.


2.11. CLAIM. 2.10 implies $Con(T)$.


PROOF. Given a finite subset $S$ of $T$, let $c_0, \ldots, c_{k-1}$ be all constants
appearing in $S$. We will use 2.10 to show that $S$ has a model of the form
$(\omega; +, \times, <, x_0, \ldots, x_{k-1})$, where $x_0, \ldots, x_{k-1}$ are the first $k$ elements of
the set $X$ given by 2.10. This model clearly satisfies (i) of 2.1 so we need
only worry about those axioms of the forms (ii) and (iii) in $S$. Part (ii) of 2.10
takes care of the axioms of form (ii) automatically, so we only need to set
up our partitions to handle those of form (iii).

We may view each $\xi \in \omega$ as coding a finite increasing sequence $a(\xi)$
from $\omega$ in such a way that all sequences from $b$ are coded by some $\xi < 2^b$.
Given a limited formula $\psi(y; z)$ and a sequence $a(\xi)$ of the same length as
$y$, we obtain a partition $F_{\psi,\xi} : [M]^{e'} \to 2$, where $e'$ is the length of $z$, defined
by $F_{\psi,\xi}(c) = 0$ if $\psi(a(\xi); c)$, and $= 1$ otherwise.

Consider, for the moment, fixed $M$ and $\xi$. For each axiom of type (iii)
occurring in $S$ there is a corresponding limited formula $\psi(y; z)$ and hence a
corresponding partition $F_{\psi,\xi}$. By 2.9 we may combine these into a single
partition $P_\xi : [M]^e \to r$, where $e$ and $r$ depend only on $S$, not on $\xi$ or $M$.
Now using 2.10, choose $M$ so large that (ii) and (iii) hold for some $X \subseteq M$
for the family $(P_\xi ; \xi < 2^M)$, and $card(X) \ge k + e$. Now choosing $x_0, \ldots, x_{k-1}$
as described above, we see that all axioms of type (iii) are satisfied. □


Our attentive reader will have noticed that since $(\omega; +, \times, <)$ has a
primitive recursive satisfaction relation for limited formulas, we can prove
in PA (or even PRA) that this structure satisfies (i) of 2.1. Hence the above
proof can be carried out in PA.

All that remains is for us to prove (in PA) that 1.2 does imply 2.10. To do 
this we need a method for obtaining homogeneous sets which grow fast. 
We are indebted to F. Abramson for some of the following arguments 
which have simplified our original proof. 

For any function $g$, let $g^{(x)}$ be $g$ composed with itself $x$ times. Let
$f_0(x) = x + 2$ and let $f_{n+1}(x) = (f_n)^{(x)}(2)$. The reader can check that $f_1(x) \ge 2x$,
$f_2(x) \ge 2^x$, $f_3(x) \ge 3_x$, where $3_x = 2^{2^{\cdot^{\cdot^{2}}}}$, a stack of $x$ 2's, and so on for
$f_4, f_5, \ldots$. Readers familiar with the Ackermann function will realize that
each $f_n$ is primitive recursive and that every primitive recursive function is
eventually dominated by some $f_n$, but these facts will not be used below.
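
The first few levels of this hierarchy can be computed directly. The following Python sketch is ours (the function name `f` with two arguments is an assumption made for the illustration); it follows the recursion literally, so only very small arguments are feasible at level 3 and beyond.

```python
def f(n, x):
    """f_0(x) = x + 2;  f_{n+1}(x) = f_n composed with itself x times, applied to 2."""
    if n == 0:
        return x + 2
    value = 2
    for _ in range(x):
        value = f(n - 1, value)
    return value

# Values for small arguments; by induction f_1(x) = 2x + 2 and f_2(x) = 2**(x + 2) - 2,
# so in particular f_1(x) >= 2x and f_2(x) >= 2**x.
level_1 = [f(1, x) for x in range(5)]
level_2 = [f(2, x) for x in range(5)]
```

Already $f_3(2) = f_2(f_2(2)) = f_2(14) = 2^{16} - 2$, and $f_3(3) = f_2(2^{16} - 2)$ is far too large to compute this way, which gives some feeling for the growth of the later levels.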


2.12. LEMMA. For every $p$ there is a $Q : [M]^1 \to p+1$ such that if $X$ is
homogeneous for $Q$ and of cardinality at least 2, then $\min(X) \ge p$.


PROOF. Let $Q(a) = \min(a, p)$. □
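
The reason this works is that $Q$ is one-one below $p$ and constant from $p$ on. The following Python sketch is ours (the names `Q` and `check` are assumptions made for the illustration); it verifies the statement for all subsets of a small initial segment.

```python
from itertools import combinations

def Q(a, p):
    """The partition of single numbers used in the proof: Q(a) = min(a, p)."""
    return min(a, p)

def check(M, p):
    """Every Q-homogeneous X within {0, ..., M-1} with at least 2 elements has min(X) >= p."""
    for size in range(2, M + 1):
        for X in combinations(range(M), size):
            if len({Q(a, p) for a in X}) == 1 and min(X) < p:
                return False
    return True
```

For example, `check(8, 3)` holds: a set containing some $a < 3$ together with a larger element receives two different values under $Q$.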


We now come to two lemmas which use relatively large homogeneous 
sets. 


2.13. LEMMA. For each $m$ there is a partition $R : [M]^2 \to r$ (where $r$ depends
only on $m$) such that if $X \subseteq M$ is relatively large and homogeneous for $R$ and
of cardinality $> 2$, then for every $x, y \in X$, $x < y$ implies $f_m(x) < y$.


PROOF. For each $i \le m$, let $P_i(a, b) = 0$ if $f_i(a) < b$; $= 1$ otherwise. Let
$p = f_m(3)$ and let $Q$ be as in 2.12 for this $p$. Use 2.9 to combine all of these
into $R : [M]^2 \to r$. Let $X$ be relatively large and homogeneous for $R$. Let
$a = \min(X)$, $b = \max(X)$. By induction on $i \le m$, it is easy to show first
that $f_i(a) < b$ (this is where you use $card(X) \ge a$) and second that $f_i(x) < y$
for all $x, y$ in $X$, $x < y$, by homogeneity. □


2.14. LEMMA. Let $P : [M]^e \to s$ ($e \ge 2$) and $m$ be given. There is a
$P^* : [M]^e \to s'$, where $s'$ depends only on $m$, $e$ and $s$, such that if there is a
relatively large $Y \subseteq M$ homogeneous for $P^*$ of cardinality $> e$, then there is
an $X \subseteq M$ such that $X$ is homogeneous for $P$ and $card(X)$ is at least $e+1$
and $f_m(\min(X))$.


PROOF. Let $h(a)$ be the largest $x$ such that $f_m(x) \le a$. For $a = a_1, \ldots, a_e$ let
$h_a = h(a_1), \ldots, h(a_e)$. Let $S(a) = P(h_a)$ if $h_a$ is an $e$-tuple (i.e., if
$h(a_1) < h(a_2) < \cdots < h(a_e)$); $S(a) = s$ otherwise. Thus $S : [M]^e \to s+1$.
Let $R$ be as in 2.13. Use 2.9 to combine $R$, $S$ into $P^* : [M]^e \to s'$. Let $Y$ be
given as in the statement of our lemma, and let $X$ be the image of $Y$ under
$h$. The partition $R$ promises us that $h$ is one-one on $Y$ so that
$card(X) = card(Y) \ge \min(Y)$. But the definition of $h$ implies that
$f_m(\min(X)) \le \min(Y)$, so $card(X) \ge f_m(\min(X))$ as desired. □


2.15. PRoposiTION. The combinatorial principle of 1.2 implies that of 2.10. 


PROOF. We are given $e$, $k$ and $r$, and must produce an $M$ as in 2.10. Find a $p$
so that for all $a \ge p$, $f_3(a)$ is reasonably big as compared with $e$, $r$, $k$ and $a$.
We will make this precise in the last paragraph of the proof; for now, just
note that $f_3(y) \ge 3_y$. Let $e' = 2e + 1$.

Now given any $M$ and any family $P_\xi : [M]^e \to r$ for $\xi < 2^M$, define a new
$S : [M]^{e'} \to 2$ by: $S(a, b, c) = 0$ if $P_\xi(b) = P_\xi(c)$ for all $\xi < 2^a$; $S(a, b, c) = 1$
otherwise. Let $Q$ be as in 2.12 and $R$ as in 2.13 for $m = 2$. Use 2.9 to
combine $Q$, $R$ and $S$ into a single $P$ and then use 2.14 to obtain
$P^* : [M]^{e'} \to s'$. The number $s'$ depends only on $e' = 2e + 1$ and on $p$. We
now apply the combinatorial principle 1.2. Find an $M$ such that
$M \underset{*}{\longrightarrow} (e'+1)^{e'}_{s'}$. By 2.14 there is an $X \subseteq M$ which is homogeneous for
$Q$, $R$ and $S$ with $card(X) \ge f_3(\min(X))$. Since $X$ is homogeneous for $Q$,
$\min(X) \ge p$. Since $X$ is homogeneous for $R$, and since $f_2(y) \ge y^2$ for those $y$
big enough to be in $X$, $X$ satisfies 2.10(ii).

To verify 2.10(iii), we replace $X$ by $X' = X \sim d$, where $d = d_1, \ldots, d_e$ are
the last $e$ elements of $X$. Let $i_\xi = P_\xi(d)$. If we show that for all
$a < b_1 < \cdots < b_e$ in $X'$ and all $\xi < 2^a$, $P_\xi(b) = i_\xi$, we will have shown that $X'$ satisfies
2.10. To show this it suffices to show that $S(a, b, c) = 0$ for some (hence, by
homogeneity, for every) $1 + 2e$ tuple $a, b, c$ from $X$. Let $a = \min(X)$ and
consider consecutive $e$-tuples from $X \sim (a+1)$. Our choice of $p$ earlier
should be such that there are more than $r^{2^a}$ such $e$-tuples, for then we can
find $e$-tuples $b$, $c$ such that $P_\xi(b) = P_\xi(c)$ for all $\xi < 2^a$, as desired. □


3. Refinements


In the proof of our main theorem we relied on various proof-theoretic 
results, in particular, on Gédel’s Second Incompleteness Theorem. It is 
possible, however, to prove our main theorem using only model-theoretic 
methods. This is the approach taken in Paris [to appear], where a general 
model-theoretic methodology (called indicator functions) for producing 
such results is developed. 

On the other hand, 1.2 is actually equivalent, in PA, to a well-known
proof-theoretic principle, and our proof has the advantage of making this
fairly obvious. Recall, from page 849, the definition of $RFN_{\Sigma_1}$, the
statement of number theory expressing the statement “For all $\Sigma_1$ sentences
$\psi$, if $PA \vdash \psi$, then $\psi$”.


3.1. THEOREM. It is a theorem of PA that $RFN_{\Sigma_1}$ is equivalent to the
combinatorial principle of 1.2.


PROOF. After the proof of 1.2 we mentioned that

for all $e$, $r$, $k$, $PA \vdash \exists M\ (M \underset{*}{\longrightarrow} (k)^e_r)$.

This fact, which we indicated how one would verify, is itself a theorem of
PA. An application of $RFN_{\Sigma_1}$ gives 1.2.

Assume 1.2 and let us prove $RFN_{\Sigma_1}$. Let $\psi$ be a $\Sigma_1$ sentence. We prove
that if $\neg\psi$, then $Con(PA + \neg\psi)$. The proof of 2.5 shows that if $\psi$ is false in
$\omega$, then $Con(T + \neg\psi)$, using 1.2. But the proof of 2.2 shows that
$Con(T + \neg\psi)$ implies $Con(PA + \neg\psi)$. □


For our final result, define a recursive function $f$ by

$f(e) =$ the least $M$ such that $M \underset{*}{\longrightarrow} (e+1)^e_e$.


3.2. THEOREM. If $g$ is a (description of a) recursive function, and if $PA \vdash$ “$g$
is total”, then for all sufficiently large $e$, $f(e) > g(e)$.


PROOF. Let $S$ be a finite subset of $T$ and let $c_0, \ldots, c_{k-1}$ be the constants
appearing in $S$. As the proof of 2.5 (in particular that of 2.11) shows, we
may interpret these constants so as to make $\omega$ a model of $S$. By examining
that proof, one can see that for all large enough $e$, we can in fact interpret
$c_0, \ldots, c_{k-1}$ using members of the interval $(e, f(e))$. If $g(e) \ge f(e)$ for
infinitely many $e$, the above would show the consistency of $T$ plus the
following axioms in a new constant $e$:

$e < c_0$;  $\neg\exists x \le c_i\ (g(e) = x)$ for all $i \in \omega$.

By the proof of 2.2 we obtain the consistency of $PA + \exists e$ ($g(e)$ is not
defined). □


We wish to thank the editor for almost forcing us to write this chapter, 
for typing it himself, and for a number of minor changes, provided he 
accepts responsibility for all misprints. 


Archimedian Axiom 12, 202
Arithmetic Universe 307 
Arithmetical Comprehension Scheme 937, 
938 
Arithmetical Hierarchy 556f 
Arithmetical Relations and Sets 
Aronszajn Tree 384 
— and Martin’s Axiom 499 
Artin’s Conjecture 132, 147, 150, 151 
Assignment to Variables 20 
Atomic Formula 18 
Atoms 361 
see also Urelement 
Autonomous Progressions of Theories 949, 
968 
Axiom of Choice 335f, 347, 454, 463 
Dependent Choice 358 
Intuitionistic Choice 986, 987 
Multiple Choice 365 
Axiomatic Recursion Theory 669 
Axiomatizable 22, 50, 112
Axioms and Rules for First-Order Logic 
34, 35, 37, 39 
Axioms for a Theory 50 
Axioms for L (Open) 101 
Axioms for Live 99 
Axioms for Set Theory 324 


555f, 802 


* Entries beginning with a Greek letter are at the end.
Back and Forth Construction 71
Baire Property 504, 798 
— and Martin’s Axiom 492, 497 
Baire Space 785 
Banach-Tarski Paradox 351 
Bar Induction 942, 954, 956, 1010 
Bar-recursion 1031 
Bar Theorem 1010 
Barwise Compactness Theorem 
see =-Compactness Theorem for 
Admissible Fragments 
Barwise Completeness Theorem 236 
—, Abstract Form 262 
Barwise Interpolation Theorem 273 
Basic Formula 74 
Basic Omitting Types Theorem 93 
Basic Stone Space Over Vt 84 
Basic Type 74 
Basic Type Over Pi 82 
Basically «-Stable 86 
Basically Saturated 74 
Basic Stability Function 87 
Basic Stability Spectrum 87 
Basis 803 
Beth’s Theorem 274 
BHK-explanation 977, 982 
Binary Expansion 998 
Binumerate 838 
Blocking Requirements 615 
see also Priority Method 
Böhm Trees 1120 
Böhm’s Theorem 1106 
Bold Face Σ, Π, Δ 240 
Bolzano-Weierstrasz Theorem 1042 
Boolean Algebras 54, 89 
Boolean Extension 174 
Boolean Power 163 
Boolos-Putnam Theorem 644, 649 
Borel Hierarchy 788 
Borel Set 750, 787 
Box (□) Principle of Jensen 
see Square Principle of Jensen 
Büchi’s Theorem 615 
Busy Beaver Problem 533 


Canonical Structure 31 
Cantor-Bendixson Construction 749 
Cantor-Bendixson Derivative 84 
Cantor-Bendixson Rank 85 
Cantor-Bendixson Theorem is False 
in HYP 955 
Cardinal 336f 
—, Regular 372 
—, Singular 372 
see also Large Cardinals 
Cardinality of a Model 63 
Cardinality Spectrum 88 
Cartesian Category/Theory 290, 1056 
Categorical Theory 66, 607 
—, ℵ₀ 163 
—, ℵ₁ 164 
Category of Definable Types and Definable 
Total Functions 1065 
Category of Models 114 
Category of T-algebras (Models) 285 
Cauchy’s Principle 211 
CCC (Countable Antichain Condition) 
— Partial Orders 492 
— Spaces 390, 492, 493, 505 
see also Antichain Conditions 
CH 
see Continuum Hypothesis 
Chain Conditions 
see CCC, Antichain Conditions 
Chain of Models 55 
Chang-Łoś-Suszko Theorem 63, 156 
Character 718 
Characteristic Function 302 
Characteristic of a Field 15 
Characterization Theorem (for 
Realizability) 989 
Characterization Theorem (for the Dialec- 
tica Interpretation) 1035 
Choice Function 335, 347 
Choice Sequences 1005 
—, Elimination of 1012f 
Chromatic Number 486 
Church-Kleene ω₁ 772 
see also Recursive Ordinals 
Church’s Thesis, Extended 989 
Church-Rosser Theorem 1102 
Circular Reasoning 
see Vicious Circle Principle 
Classes in Set Theory 338f 
Classifying Topos 301 
Closed Unbounded Set 372f, 407 
Closure and Distributivity Conditions 
424f, 429, 434f, 443 
Coalgebras 288 
Code 826, 830f, 835f 
Code (of a Function) 702 
see also Index 
Coding of Tuples 536, 693, 702 
Coding Scheme 755 
Cohen Extension 414 
Cohen PO Set 420 
Coherent Theory 296 
Collapsing Cardinals 422, 441 
Collapsing PO Set of Lévy 441 
Collection Principles 
—, Δ₀ 239 
—, Σ 240 
Coloring Infinite Maps 28 
Combinatorial Principles 423 
— in L 474, 479, 483 
see also Diamond, E-principle, Kurepa 
Hypothesis, Square, W-principle 
Combinatory Algebra 1094 
Compactness Theorem 10, 59, 118, 356 
—, Proofs of 33, 59, 118 
— for Propositional Logic 26 
Σ₁- — for Admissible Fragments 251, 
716 
Companion Operator 156f 
Companion of a Spector Class 778 
Compatibility 412 
Complete Formula 79 
Complete R.E. Set 543, 553 
Complete Theory 16, 50 
Completeness Test 66 
Completeness 844 
—, Demonstrable Σ₁ 844 
—, Semantic 854 
—, Syntactic 854 
—, ω- 854 
Completeness (of Intuitionistic 
Logic) 1025f 
Completeness (of the Reals) 999 
Completeness Theorem 22f, 844 
—, Barwise 236 
—, Extended 265 
— for Categories 296 
— for Lω₁ω 99 
— for L(Q) 44 
—, Hilbert-Bernays 860 
—, Proofs of 36f 
—, Significance of 22f 
—, Statements of 35, 39 
Complexity 539, 899f 
Complexity Class 539 
Complexity of Computation 539 
Complexity of Decision Procedures 621f 
Computability 
see Effective Computability 
Computation Record 536, 540 
Computational Complexity 539 

see also Complexity 
Computers, Digital 536 
Cony 828 

k-Con, 853 

w-Con? 853 

w-Con; 853 

1-Con, 852 
Concentrates 691 
Conceptual Completeness Theorem 297 
Condensation Lemma 465 

see also Mostowski Collapsing Lemma 
Condition for T 89 
Conjunctive Game Formula 254 
Consequence 10, 51 
Conservation Theorem 858 
Conservative Extension 56, 932 
Consistency Property 
see Hintikka Set 
Consistent Theory 16, 50 
Constructibility 318, 359, 455, 642f, 649, 
654, 667f 
Constructibility, Axiom of 426, 428, 465 
Constructivism 974 
Constructivism, Naive 974 
Constructivization, Global 1040 
Constructivization, Local 1040 
Context-free Grammars 583f 
Continuity Schema 1006 
Continuous 1000 


Continuous, Uniformly 1000 
Continuous Functionals, Extensional 1029 
Continuous Functionals, Intensional 1028 


Continuum Hypothesis 318, 344, 376, 407, 
420f, 424f, 454, 465, 635 
— and Martin’s Axiom 493, 494 
Contradiction, Law of 25 
Co-recursively Enumerable 
Corona Problem 207 
Countable Functionals 1029 
Countably Compact Space 508 
Course-of-Values 540 
CRA 974, 990 
Craig Interpolation Theorem 72 
see also Interpolation Theorem 
Creative Set 553 
Critical Functions of Ordinals 945 


OA7E 


Cross-section 151, 152 
CTM 

see Transitive Sets and Models 
C.U.B. 


see Closed Unbounded Set 
Cut-elimination Theorem 40, 875, 882, 
888, 898f 
— for Infinitely Long Formulas 957 
Cut Rank 873 
Cut Rule 39, 872 


D1, D2, D3 
see Derivability Conditions 
Decidable Model 16 
Decidable Relation 535 
see also Recursive Relation 
Decidable Theory 16, 52, 596f 
Decidability in Topology 618 
Ded(x) 87 
Dedekind Set 362 
Definability Lemma 416 
Definition (in a Structure) 555 
Degree Hierarchies 669 
Degrees, Automorphisms of 638, 641, 646 
Degrees, Homomorphisms of 636f, 641 
Degrees, Isomorphisms of 638, 641, 645 
Degrees, Kinds of 
α-degrees 633, 660 
Constructibility 642 
Cuppable 638f 
Exact Pairs of 637f 
High 646 
Kleene 728 
Low 646 
Minimal 636, 638, 640, 647 
Minimal α- 661 
R.E. 551f, 634, 640, 645f, 648f, 650 
Splitting 638 
Turing 
see Degrees of Unsolvability 
Degrees of Theories 647f 
Degrees of Unsolvability 550f, 631f 
Degrees, Sets of 
Determined Sets of 642f 
First-order Definable Sets of 640f 
Ideals of 636f, 643, 644, 647 
Independent Sets of 635f 
Initial Segments 
see Ideals of Degrees 
Semilattices 634f, 645f 
Degrees, Theory of 639, 640f, 646, 647 
Delta System 389 
de Morgan’s Laws 25 
Dense Sets 412 
Density Theorem 645, 661 
Dependent Choice, Axiom of 358 


Derivability Conditions 827, 839f 
Derivation in Gentzen System 37 
Derivative 208, 212 
Determinacy 642f, 808 
—, Axiom of 369 
—, Borel 643, 651 
—, Projective (PD) 642f, 808 
Diagonalization Lemma 827 
Diagram 57 
Dialectica Interpretation 1032f 
Diamond (◇) 318, 378, 510, 517 
—, Consistency of 438 
Differential 221 
Differential Closure 167 
Diophantine Equations, Number of 
Solutions of 586f 
Diophantine Relations 585f, 588f 
Direct Limit 142, 143 
Direct Product of PO Sets 438 
Directness 899f, 904f 
Discrete Orderings 603, 604, 610 
Distributivity Conditions 
see Closure and Distributivity 
Conditions 
Doctrine 284, 293 
Dominating Families 496, 497 
Double Negation, Law of 25 
Dowker Space 510 
Downward Löwenheim-Skolem 
Theorem 64 


EC 
see Elementary Class 
EC_Δ Class 50 
see also Axiomatizable 
ECF 1028 
EED 293 
E.-M. 
see Ehrenfeucht-Mostowski 
E-Principle of Jensen 474, 517 
—, Consistency of 435f 
E(T) 
see Category of Definable Types and 
Definable Total Functions 
Eastern Model Theory 48 
Effective Computability 525, 528, 530 
see also Recursive Function 
Effective Procedure 528f 
Effectively Enumerable Relation 541 
see also Recursively Enumerable 
Effectively Inseparable Sets 842 
Ehrenfeucht-Mostowski (E.-M.) 
— Set 184 
— Theorems 182 
Elementarily Equivalent 22, 50, 641 
Elementary Analysis (Intuitionistic) 982 
Elementary Chain 55 
Elementary Chain Theorem 56 
Elementary Class 22, 50, 112 
see also Finitely Axiomatizable 
Elementary Diagram 58 
Elementary Embedding 53, 143, 186 
Elementary Extension 53 
Elementary Formula 253 
Elementary in the Wider Sense 
see Axiomatizable and EC_Δ class 
Elementary Map 
see Elementary Embedding 
Elementary Monomorphism 
see Elementary Embedding 
Elementary Relation 253 
Elementary Substructure 53, 185 
Elementary Topos 302 
Elimination Mapping 1015 
Elimination of Quantifiers 55, 146, 147, 
152, 155, 600f 
Elimination Theorem, First 1016 
Elimination Theorem, Second 1016 
Embedding Theorem 363 
Empty Theory 51 
Encoding 
see Coding 
End-extension 247 
Entity 200 
Enumeration Property 701, 730 
Enumeration Theorem 704, 715 
see also Normal Form Theorem 
Enumerative Systems 950 
Envelope 728 
Equality Axioms 29 
Equalizer 1057 
Equation Calculus 656, 899f 
Equinumerous 336 
Equivalent Theories 50 
Erdős-Rado Theorem 183, 392 
Essential Unboundedness Theorem 850 
Euclidean Geometry, Decidability of 596 
Excluded Middle, Law of 25 
Existence and Minimality Lemma 414 
Existence Theorem for Basically Saturated 
Models 75, 76 
Existence Theorem for Prime Models 80 
Existence Theorem for Recursively 
Saturated Models 70 
Existentially Closed Model 92, 121, 157f 
Existentially Closed Number Theories 172 
Existentially Closed Skew Fields 171, 172 
Existentially Complete 
see Existentially Closed Complete 
Existentially Universal (for a Theory) 162 
Expansion of a Structure 30 
Extension 305 
Extension, Recursive 887 
Extensional Structure 246 
Extensionality Axiom 325 
Extensionality in Finite Types 931 
—, Elimination of 934 
External Set 204 


Fan Theorem 1012, 1017 
Field 15, 69, 72f, 92f, 119f 
—, Algebraically Closed 54, 77f, 92, 93, 
146, 147, 608 
—, © 150, 151 
—, Differential 84, 87 
—, Differentially Closed 164f 
—, Finite 153 
—, Laurent Series 150f 
—, Non-Archimedean 12, 199 
—, Ordered 54, 64f, 74f, 94 
—, P-adic 149f, 611 
—, Pseudofinite 120, 611 
—, Real Closed 130, 147f, 609 
—, Separably Closed 153 
—, Skew 171, 172 
—, Valued 133, 149f 
Field Object 306 
Filter 106 
—, Cofinite 107 
—, Principal 107 
Fine Structure of L 476, 644 
*-finite (Star-finite) 213 
Finite Character 61 
Finite Hyperreal 
— Number 202 
— Vector 215 
Finite Intersection Property 107 
Finite Limits 1058 
Finite Products 1055 
Finitely Axiomatizable 22, 50, 119, 647f 
Finitely Generated Class of 
Functionals 703 
Finitely Generated Model 52 
Finitely Satisfiable 58 
Finite Type Structures 919 
—, Maximal 923 
—, Minimal 924 
Finite Types 1026 
Finitism 974, 978 
FIP 
see Finite Intersection Property 
First-Order Arithmetic (Intuitionistic) 982 
see also Peano Arithmetic 
First Product Theorem 438f 
Fixed Point 683, 684 
Fixed Point Theorems 856f 
—, Brouwer’s 1002, 1044 
— of λ-calculus 1101 
see also Recursion Theorem and 
de Jongh’s Fix-point Theorem 
Forcing Base 98 
Forcing (Cohen) 300, 318, 360, 414f, 641, 
699 
Forcing Conditions 415 
Forcing (Finite Robinson) 90, 159, 160, 
170, 171 
Forcing (Infinite Robinson) 159, 161, 170, 
171 
Forcing in Infinitary Logic 98 
Forcing Product of PO Sets 439f 
Forgetful Functor 115 
Formally Intuitionistic Systems, Translation 
into 962 
Formula 19, 405 
—, Atomic 18 
—, Conjunctive Game 254 
—, First-order 19 
—, Minor 872 
—, Principal 872 
—, Side 872 
—, Vaught 254 
—, Δ₀ 238, 409f 
—, Π 239 
—, Σ 239 
Fragment 250 
Free Algebra 287 
Free Variable 20 
Friedberg Jump Theorem 638, 639 
Friedberg-Muchnik Theorem 645, 661 
see also Post’s Problem 
Functional 683 
—, Evaluation 685 
—, Minimalization 686 
—, Normal 697, 714 
—, Operative 687 
—, Primitive Recursion 687 
—, Signature of 683 
—, Substitution 686, 688 
Fundamental Theorem of Calculus 214 
Fundamental Theorem of 
Ultraproducts 112 


Galaxy 204 
Gale-Stewart Games 642 
Game Form Theorem 256 
Games 748, 769 
see also Conjunctive Game Formula, 
Determinacy, Gale-Stewart Theorem 
Gandy-Kreisel-Tait Theorem 265 
GCH 
see Generalized Continuum Hypothesis 
Geometric Morphism/Functor 300 
Generalization Rule 32 
Generalized Continuity 1019 
Generalized Continuum Hypothesis 407, 
454, 465 
—, Consistency of 426 
Generalized Finite 655, 657, 658 
see also A-finite 
Generalized Suslin Hypothesis 473 
Generated Substructure 52, 184 
Generic Circle 292 
Generic Extension 
see Cohen Extension 
Generic Model Theorem 91, 99 
Generic Sets 90, 99, 360, 413 
Generic Structure 288 
Gentzen’s Theorem 
see Cut-elimination Theorem 
Geometric Morphism 300 
Gödel Completeness Theorem 32 
Gödel’s Functional (or Dialectica) 
Interpretation 963, 1035 
Gödel Incompleteness Theorem 560, 599 
Gödel Numbering 547, 633, 647 
Good Ultrafilter 134 
Graphs 486 
Grothendieck Topos 298 
Groups 8, 62, 76, 94f 
—, Abelian 8, 611 
—, Divisible 8 
—, Existentially Closed 168f 
—, Free 54, 97 
—, Galois 72 
—, Linear 121 
—, Matrix 62 
—, Orderable 62 
—, Solvable 62 
—, Torsion 9, 94, 97 
—, Torsion-free 9 
—, Universal Locally Finite 190 
—, Z- 151 
see also Whitehead’s Problem 
Groups and Unsolvable Problems 578f 


H(k), Huo(k) 244 
Hahn-Banach Theorem 354 
Halting Problem 538, 569 
Hanf’s Theorem 648, 650 
Handbook, Cost of 
see Large Cardinals 
Henkin Axioms 30 
Henkin Construction 31 
Hensel’s Lemma 149, 150 
Herbrand Normal Form 889 
Herbrand’s Theorem 876, 898f 
Hereditarily Continuous Functional 
HEO 
see Hereditarily Effective Operations 
Hereditarily Effective Operations 1028 
Hereditarily Finite Sets 43 
Hereditarily Hyperarithmetic 
Operations 952 


951 


Hereditarily Recursive Operations 951, 
1027 
HFD Set 515 


Hierarchy Theorem 558f 
Higher-order Language 7, 43, 1060 
Higher Type Objects 

see Object of Higher Type 
Hilbert’s Basis Theorem 126 
Hilbert’s Program 822f, 868 
Hilbert-Bernays Completeness 

Theorem 860 
Hilbert’s Nullstellensatz 125, 146 
see also Nullstellensatz 
Hilbert’s 17th Problem 148, 149 
Hilbert’s Program 598 
Hilbert’s Tenth Problem 568, 584f 
Hilbert’s Thesis 41, 587 
Hintikka Set 250 
Homeomorphy Problem 579f 
Homogeneous Linear Ordering 65 
Homogeneous Model 75 
Homogeneous-universal Structures 141, 161 
Homogeneous Set 183, 193 
Homomorphic Image 72 
Homomorphism 72, 108 
— of λ-algebra 1100 
Hopf Algebra 288, 308 
HRO 

see Hereditarily Recursive Operations 
Hyperarithmetical Hierarchy 559, 753 
Hyperelementary 767 
HYP(a), HYP(W) 245, 246 
Hyperimmune 651 


Hyperinteger 202 

Hyperjump Hierarchy Comprehension 
Schemes 947 

Hyperreal Numbers 200, 202 


ICF 1028 
Ideal 13, 355 
—, Prime 14 
—, Maximal 14 
Ideal Statements 823 
Image (Homomorphic) 72 


Immediate Extension 150 
imp 826, 836 
Inaccessible Cardinal 343, 396 


Inclusion 310 
Incompleteness Theorem 
—, First 825, 827, 845, 861 
—, Formalized First 852 
—, Second 825, 828, 846, 862 
—, Formalized Second 828 
Independent Theory 88 
Index (of a Function) 538, 702 

see also Code 
Index (of the Handbook) 

see Self-reference 
Index Transfer Method 724, 726, 728 
Indicator Function 1141 
Induction 739f 

—, Positive Existential 761f 

—, Positive Elementary 764 

—, Non-monotone 774f 
Induction Completeness Theorem 690 
Induction 
on TC 243 
on Sets 334 
on Ordinals 331f 
—, Transfinite 878, 942, 1011 
—, on Unsecured Sequences 1009 
Inductive Definitions 526 
—, Dual of 749 
—, Theories of 968 
Inductive Theory 55 
Ineffable Cardinal 399, 488 
Infinite 
Hyperreal Number 202 
Integer 202 
Infinite Derivation 880 
—, Code for 886 
Infinite Telescope 204 
Infinitesimally Close 202 
Infinitary Logic 43, 97f, 244f 
Infinitesimal 
Number 12, 198, 202 
Vector 215 
Infinitesimal Analysis 197f 
Infinitesimal Calculus 207, 208f 
Infinitesimal Microscope 203 
Infinite Terms 959 
—, Normalization of 960 
Infinity Axiom 326 
Instantaneous Description 536 
Integral 213 
Integral Definite Functions 152 
Iteration Theorem 
see Parameter Theorem 
Intermediate Value Theorem 1001, 1042f 
Internal Quantification 304 
Internal Set 199, 204, 265 
Interpolation Theorem 72, 272 
see also Robinson Consistency Theorem, 
Suslin Separation Theorem 
Interpretation of λ-terms 1098 
Interpretation 
of a Language in a Topos 1076 
of One Theory in Another 598, 612f 
of Terms of Forcing Language 413 
Intuitionism 974 
Inversion Lemma 873, 881, 887 
Isolated Type 79, 82 
Isomorphic Embedding 53 
Isomorphism 53, 109 
Iteration of Forcing 
see Products of PO Sets and Martin’s 
Axiom 


J 
see L and Oracle 


Jensenlehre 

see Fine Structure of L 
Joint Embedding Property 73, 160, 161 
de Jongh’s Fix-point Theorem 856f 
Jónsson Theory 73 
Jump 
—, Iterating 634, 644, 636 
— Theorems 638f 
— Operator 551, 558, 634, 638f, 645, 
670 

Jump Hierarchy Comprehension 
Schemes 947 


k-consistency 853 
Kaiser Hull 157 
Kent’s Theorem 855 
Kernel 361 
KH 
see Kurepa Hypothesis 
Kleene Class of Functionals 709 
Kleene Recursion in Type 2 Object 692 
Kleene Recursive (Higher Types) 711 
Kleene Recursive in 728 
KPU 239 
Kripke-Joyal Semantics 300 
Kripke-Platek Axioms 239 
Kurepa Hypothesis, Consistency of 428f, 
437f 
—, Independence of 442 
Kurepa Tree 428, 488, 521 


L 
see Constructibility 
L-space 517 
L-structures 
see Model 
La 244 
L...-equivalent 97 
Lambda Calculus 534 
Lambda Definability 534f 
Large Cardinals 342, 396f 
see also Ineffable Cardinal, Mahlo Car- 
dinal, Weakly Compact Cardinals, Inac- 
cessible Cardinal, Measurable Cardinal 
Lattice of a-Recursively Enumerable 
Sets 662 
Lattices in Degree Theory 634f, 645f 
Lawless Sequences 1019f 
Lawlike Operations 1027 
Least Upper Bound 1001 
Leibniz’ Principle 198, 200 
Lévy Absoluteness Theorem 245 
Light Face Σ, Π, Δ 240 
Limit Recursive 634 
Lindenbaum Doctrine 294 
Lindström’s Theorem 45 
Linear Order 64f, 75, 82f 
Linear Order, Second-order Theory of 
617 
Linearly Ordered Sets, Decidability of 612 
Löb’s Theorem 845, 848 
—, Formalized 855, 858 
Local System 120 
Localization 298 
Logic 
—, Effectiveness in 546 
—, First-Order 6f, 587, 588 
—, Infinitary 43, 97f, 244f 
—, Modal 620 
—, Non-classical 620 
—, Second-Order 7 
—, Weak Second-Order 43 
— with new Quantifiers 43 
Logical 
— Category/Morphism 295, 296 
— Consequence 10, 51 
— Functor 1085 
— Morphism 296 
Lopez—Escobar’s Theorem 
see Interpolation Theorem 
Łoś’ Theorem 
see Fundamental Theorem of Ultraproducts 
Łoś-Tarski Theorem 156 
Łoś-Vaught Test 66 
Löwenheim-Skolem Theorem 10, 63, 64, 
65, 374 
L.U.B. Principle is False in HYP 954 
Luzin Set 377 
Lyndon Homomorphism Theorem 72 


MA 

see Martin’s Axiom 
Mahlo Cardinal 397 
Many-one Reducibility 544 
Many-sorted Logic 42, 49 
Markov’s Rule 1036 
Markov’s Schema 990, 1021f, 1025 
Martin’s Axiom 318 
—, Consistency of 444f 
—, Uses of 491f, 504f 
Matijasevic’s Theorem 557 
Maximal Sets 646 
Measurable Cardinal 344, 400, 806 
Measurable Set 798 
Measure and Martin’s Axiom 498 
Metrizable Space 518 
Minimal Pairs 661 
Minimal Model 166, 167, 170 
Model (Structure, L-Structure) 17, 50 
Model Companion 154f 
Model Complete Theory 54, 130, 144f 
Model Completion 141, 154, 155, 165, 166 
Model Pair 71 
Model of λ-calculus 
see λ-algebra 
Modules 87 
Modulus of Continuity 890 
Monic 1057 
Monoids 66 
Monomorphism 1057 
Monotone Operator 683, 731, 744 
Morley Categoricity Theorem 82 
Morley Expansion 57 
Morley Rank 83, 84 
Morphism 286 
Mostowski Collapsing Lemma 201, 247, 
309, 465 
Multiply Ordered Theory 88 


Natural Numbers 330, 528 
see also Peano Arithmetic 
Natural Number Object 1086 
Near-standard 203, 216 
neg 826, 836 
Negation Normal Form 249 
Negation (or Double-negation) 
Translation 962, 985 
Negative Formula 985 
—, Almost 988 
Neighbourhood Function 1008 
Next Admissible Set 245, 777 
N.N.F. 
see Negation Normal Form 
No Counterexample Interpretation 870 
Non-principal Ultrafilter 108 
Non-standard Analysis 231 
Norm 700, 765 
Normal Class of Functionals 697 
Normal Form Theorem 537, 541 
Normal Sequence of Functionals 699 
Normalization Theorem of Curry 1105 
Normed (Class of Relations) 700 
Nowhere Dense Set 496, 497 
Normed Pointclass 730 
Nullstellensatz 125, 146, 147, 153, 167, 171 
Number Selection Lemma 706 
Number Theory in Finite Types 934 
Numeralwise Representability 838 
Numerate 838 


1160 
Object of Higher Type 669f, 692, 707, 740 
Omitted Type 77 
Omitting Types Theorem 78 
One-One Reducibility 544 
One-type 189 
Operation 285 
Operation Symbol, =- 241 
Operative Functional 684 
Oracle 540, 633 
Ordered Pair 329 
Ordered Theory 88 
Ordinal Notations 562, 752, 771, 968 
Ordinals 330f 
see also Admissible Ordinal, Recursive 
Ordinals, Ordinal Notations 
Ostaszewski Space 510 


P=NP Problem 600, 625 
Parameter Theorem 544f 
PA 
see Peano Arithmetic 
Parametrization 763 
Paris—Harrington Theorem 
Partial Elements 1058f 
Partial Function 528 
Partial Isomorphism 97 
Partial Map 1074 
Partially Ordered Sets 407, 412f 
— and Martin’s Axiom 496 
Partition Calculus 390f, 484, 1134 
PD 
see Determinacy 
Peano Arithmetic 79, 330, 840, 851, 860f, 
1134 
Perfect Kernel 85 
Perfect Set 382, 784 
Perfect Subset Property 798 
Perfect Subset Theorem 277 
Permutation Models 361 
Persistence Property 239 
P.G. 
see Point Generator 
Phrase Structure Grammars 582 
PO Sets 
see Partially Ordered Sets 
Point 708 
—, Arity of 716 
—, Character of 718 
—, Type of 707 
Point-Generator 1003 
Point Projection 710 
Pointclass 785 
—, Properties of 730 
Pointset of Type k 728 
Polish Space 785 
Polynomial Time 627 
Positive Formula 72 
Post Canonical Systems 755 
Post Correspondence Problem 580 
Post’s Problem 554, 645f, 661 
Post’s Theorem 558 
Post Words 572f 
Power Set 328 
— Axiom 326 
PRA 840, 859 
Predicate Calculus, Intuitionistic 980 
Predicates on Admissible Sets (=, A, , 
A) 240 
Predicativity 949, 968 
Presburger’s Arithmetic 603, 606, 617, 625 
Preservation of Cardinals 422f, 425f, 443f 
Pressing-down Lemma 374 
Pretopos 297 
Prewellordering 700, 796 
— Property 700, 730 
— Theorem 700 
Pre-λ-Algebra 1098 
PR Formula 843 
Prime Formulas 23 
Prime Ideal Theorem 356 


Prime Model 80, 145, 165, 166, 167, 170, 
610 

Primitive Recursion 
— Functional 681, 884 
— Test 145 


—, <&, Recursive 884 
Primitive Recursive 534 

— Function 831 

— Formula 843 
Priority Method 554, 640, 645f, 648, 661 
Productive Function 553 
Products of PO Sets 438f 
Program 528 
Progression Rule 879 
Projection 686, 789 
Projective Determinacy 

see Determinacy 
Projective Plane 306 
Projective Set 790 
Projectum 478, 666, 668 
Proof 32 
Proofhood, Decidability of 546f 
Proof-search Procedure 898f, 904f 
Propositional Logic 23 
Prov, 826, 837f 

Pry 826, 838 

Prt 841 

Pseudo-elementary Class 116 
Pullback 1057 


Quantification Operators 935, 936 
Quantifier 44, 694, 769, 770 

—, Bounded 200 

—, Dual of a 695, 770 

—, Existential 695 

—, Game 769 

—, Monotone 770 

—, Suslin 695 

—, Universal 695 
Quantifier Axioms 30 
Quasi-disjoint 

see Delta System 


RA 
see Recursive Analysis 
Rado Selection Lemma 357 
Ramified Analysis 948 
Ramified Analytic Sets 948 
Ramified Progressions of Theories 949, 
968 
Ramsey Theorem 183, 392 
—, Finite 392 
—, Extension of the Finite 393 
R.E. 
see Recursively Enumerable 
Real Number Generator 993 
Realizability 987f, 1018 
Realize a Type 77, 266 
Realized Type 77 
Real Numbers, Non-axiomatizability of 11 
Real Statements 823 
Recursion in Set Theory 333 
Recursion (Σ) 243 
Recursion on Ordinals 
see α-Recursion 
Recursion Operators 933, 936 
Recursion Theorem 545, 690, 705, 716 
Recursive Analysis 
—, Classical 975, 991 
—, Constructive 974, 990 
Recursive Analogues of Transfinite Number 
Classes 772 
Recursive Function 529, 531, 560f, 632, 
634, 649, 740, 754 
—, General 534 
—, Primitive 534 

Recursive Functional 540f 
— of Finite Type 709, 952 

Recursive Language 50 

Recursive Operator 563f 

Recursive Ordinals 559, 561f 
see also Church-Kleene ω₁ 

Recursive Real Numbers 562f 

Recursive Relation 305, 532 

Recursive Set 532, 800, 842 

Recursively Axiomatizable Theory 50, 
547f, 552 

Recursively Enumerable 542f, 552f, 570, 
750, 800, 841 

Recursively Saturated Model 69 

Redex 1102 

Reduced Product 109 


Reducibility 
—, Many-One 544 
—, One-One 544 


—, Turing 550 
see also Relative Recursiveness and 
Degrees 
Reduct of a Structure 30, 115 
Reduction 276, 795 
Reflection Principle (Σ) 240 
Reflection Principle (Proof Theoretic) 892 
—, Local 845f 
—, Uniform 845f 
RFN‘(T) 845 
RFN;,(T) 846 
RFN;,(T) 849 
RFNi(T) 849 
Rfn(T) 845 
Rfny(T) 846 
Rfns,(T) 849 
Rfny,(T) 849 
Reflexiveness Theorem 851 
Reflexive Theory 851 
Regular Cardinal 372 
Regular Category 295 
Regular Set 665 
Regularity Axiom 326 
Relative Admissibility 242 
Relative Consistency 404 
Relative Consistency Theorem 859 
Relative Recursiveness 540, 549f, 633f 
see also Relativization 
Relatively Algebraically Closed 144 
Relativization 638, 639, 648f 
see also Relative Recursiveness, 
Recursively Enumerable, Roger’s Conjecture, 
Reducibility 
Relativization of Notions to Universes 
Remarkable E.-M. Set 194 
Replacement (2, 2,) 241, 658 
Representability 751, 753f 
Representing Function 831 
Resolvable Admissible Set 270 
Restricted Induction 931, 967 
Restriction of a Functional 691 
RFN 
see Reflection Principle (Proof 
Theoretic) 
Rice’s Theorem 545f 
Rings 
—, Division 110, 124 
—, Local 296 
—, Prime 123 
—, Primitive 117, 124 
—, Principal Ideal 14 
— of Polynomials 54, 97 
—, Regular 172f 
—, Regular f 173f 
Robinson Consistency Theorem 71 
see also Interpolation Theorem 
Robinson’s Test 144f 
Roger's Conjecture 638 
Rolle’s Theorem 1046 
Rosser’s Theorem 841 
—, Formalized 854 
Rules for First-Order Logic 
Hilbert-style 31 
Gentzen-style 37, 872 
Rules for Lω₁ω 99 
Rule Set 741 
—, Finitary 742 
—, Deterministic 744 
—, Recursive (r.e.) 751 
—, Regular Arithmetical 752 
—, Regular Elementary 757 


924 


S-Space 516 
Satisfaction Relation 20f 
Satisfiable 50, 69 
Saturated Model 76, 128, 134, 147 
—, Σ- 266 
—, Recursively 69 
Scale 810 
Schemata in Finite Type Theories 932 
—, Comprehension 933, 938 
—, Choice 933, 935, 938 
—, Second-Order 937 
Second-Order Arithmetic 168, 639f 
see also Analysis 
Second-Order Arithmetic 
(Intuitionistic) 983 


Second Product Theorem 440 


Section 729 

1-Section of Relative Primitive 
Recursive Functionals 961 

Self-reference 1162 

SemComp, 854 

Semi-decidable Relation 541 


Semigroups 576 


Semi-recursive Relation 542f, 693 
see also Recursively Enumerable 


Semi-recursive Set 542f, 693, 800 
Semi-Thue Processes 571f, 581f 
Sentence, First-Order 20 
Separation Axiom 325 
Separation Schema 321f 
—, Δ₀ 239 
—, Δ- 241 
Seq 834 
Sequent 37 
Sequentially Compact Space 509 
Set Mappings 485 
Set Theory, Axioms for 15, 321f 
Set Theory, Consistency of 654 
Sheaf 172f, 298 
Signature of a Functional 683 
Simple Model 122 
Simple Set 553 
Simplified Form of a Point 716 


Simultaneous Induction Lemma 687 


Site 298 

Skolem Expansion 57, 164, 185 
Skolem Function 57, 185 
Skolem Paradox 406 
Skolem Relation 56 
Skolemization 164 
Skolem Term 185 


S-m-n Theorem 544, 705, 715 


Solvable Term of λ-Calculus 
Soundness Lemma 35 
Sound Theory 827, 844 
Souslin Line 
see Suslin Line 
Space, Metric (Intuitionistic) 1002, 1003 
Spector Class 767, 777 
Spector-Gandy Theorem 276 
—, Abstract Version of 768 
Speed-up Theorem 539 
Splitting Theorem 663f 
Square (□) Principle of Jensen 479 
—, Consistency of 434f 
Stable Ordinal 246 
Stable Theory 87, 164, 173, 192 
Stage Comparison Partial Function 696 
Stage Comparison Theorem 697, 766 
Stages of an Inductive Definition 758 
Stages (for Constructing Sets) 323 
Standard Codes 478 
Standard Part 203, 215 
Standard Representation 
State 530 
Stationary Set of Ordinals 372f, 407 
Steinitz Theorems 147, 149 
Stone Space of a Language 59 
Stone Space over a Model 84 
Strategy 643 
Strong Relation Universality 267 
Structure 17 
see also Model 
Structure Induced by an Interpretation 613 
Sturm’s Theorem 148 
St(x) 203 
Sub 826, 836f 
Subformula 95 
— Property 876 
Subobject Classifier 1068 
Substructure (Subsystem) 52, 184 
Subsystem 
see Substructure 
Suitable Class 685 
Superjump 735 
Superstable Theory 87 
Superstructure 200 
Support 237, 361 
Suslin 
— Hypothesis 472, 484 
— Line 378 
— Property 
see CCC Spaces 
— Quantifier 695 
— Separation Theorem 272, 559, 792 
— Set 811 
— Tree 385, 472, 473 
— and Martin’s Axiom 499 
Symbol 7 
Symmetric Extension 360, 362 
SynComp, 854 
Systems of Notations for Ordinals 
see Ordinal Notations 
TC 
see Transitive Closure 
T-predicate 537, 541 
Tarski’s Decidability Theorem for the 
Reals 596, 606, 609, 624 
Tarski’s Undefinability Theorem 558 
see also Truth in Arithmetic 
Tautology 
of Propositional Logic 25 
of First-Order Logic 28 
Taylor's Small Oh Formula 216 
Term 
—, Closed 18 
—, of Forcing Language 413 
—, Skolem 185 
Theory 50 
—, Complete 16 
—, Consistent 16 
—, of a Class K 50 
—, Decidable 16 
Thue Processes 575 
Topology, Unsolvable Problems in 579f 
Topos 298, 1054, 1068 
Total Function 528 
Totally Transcendental 84 
Transfinite Induction Scheme 942 
see also Induction 
*-Transform 199, 200f 
Transition Function 531 
Transitive Closure 242 
Transitive Collapse 
see Mostowski Collapsing Lemma 
Transitive Set Object 309 
Transitive Sets and Models 405f 
Transitivity Theorem 559 
Tree 381f, 407, 472 
see also Aronszajn Tree, Suslin Tree 
Tree Decidability Theorem 616 
Trm 860f 
Tr, 844, 850 
Truncation Lemma 248 
Truth 
— Assignment (for Propositional 
Logic) 24 
— (in Arithmetic) 558, 560, 639, 647 
— Definition 843 
— Lemma 415 
— Set of Arithmetic 639, 647 
— Table 24 
Tugué Object 692 
Turing Machine 530f, 569f, 587, 590f 
—, Universal 536 
Turing Reducibility 550, 633 

see also Relativization 
Two-cardinal Models 68 
Two-cardinal Theorems 68, 487 
Type 76, 127 

—, 2 Family of 266 

— (Definable) 1064 

— of a Point 707 

— Symbols 929 


Ulam Matrix 376 
Ultrafilter 107 

—, Good 134 

—, Principal 108 
Ultrapower 109 
Ultraproduct 109f 

— of Rings 110 
Uncountability Quantifier 44 
Undecidable Theory 16, 548, 597 
Unfruitful Limitations on Mathematics 330 
Uniformization Theorem 368, 803 
Uniformly Continuous 206 
Uniformly Differentiable 206, 208 
Union Axiom 326 
Uniqueness Theorem for Prime Models 80 
Uniqueness Theorem for Basic Saturated 

Models 75, 76 
Universal 

— Formula 52 

— Function 702 

— R.E. Relation 543 

— Set 788 

— System 705, 715 

— Theory 53 

— Turing Machine 536 
Universes of Sets and Functions 918 

—, Cartesian Closed 921 

—, N-closed 921 

—, 3“closed 921 
Unsecured Sequences, Induction over 1009 
Unsolvable Problems 534, 538, 545, 548 
Unstable Theory 87 
Upward Löwenheim-Skolem Theorem 65 
Urelement 237, 322 


V = L 

see Constructibility, Axiom of 
Valid 29 
Validity (Intuitionistic) 1024f 


Value Category 286 
Vaught Covering Theorem 271 
Vaught Formula 254 
Vaught Reduction Theorem 276 
Vicious Circle Principle 

see Circular Reasoning 


W-Principle of Silver 514 

—, Consistency of 432f 
Weak a-recursiveness 660 
Weak Counterexample 996 
Weak Robinson Forcing 90 
Weakly Compact Cardinals 397, 483 
Weierstrasz’ Approximation 

Theorem 1045 
Well-founded Relation 305, 743 

—, Length of 792 

— of Orderings 941 

— Part 247 

— Structure 246 

— Tree 747 
Well Ordered Model 193 
Well-orderings 96 

—, Comparability of 945, 954 

—, Provably Recursive 945 

— of Z 946, 958 
— of Ramified Analysis 959 

—, Natural 878 
Well-ordering Theorem 347 
Western Model Theory 48 
Witnessing Constants 30 
Whitehead’s Problem 494 
Witnessing Expansion 29 
Word Problem 169, 171 

— for Semigroups 577 

— for Groups 578 


Z-group 151 
Zariski Topos 302, 307 
Zermelo-Fraenkel Axioms for Set 
Theory 324f, 404, 840, 851 
ZF 
see Zermelo—Fraenkel Axioms for Set 
Theory 
ZFC 
see Zermelo-Fraenkel Axioms for Set 
Theory 
ZO Set Theory 309 
Zorn’s Lemma 338, 355 
α-Calculability 659 
α-Finite 657f 
see also A-finite 
α-Recursion 653-679, 885 
α-Recursive 656-658 
α-Recursive in 660 
α-Recursively Enumerable 656-658 
€-Model 64 
No-Set 128 
nrSet 129 
λ-Algebra 1099 
—, Weakly Extensional 1099 
—, Extensional 1099 
—, Interior of a 1100 
—, Hard 1100 
—, Sensible 1119 
λ-Calculus 1092f 
λ-Definable Functions 1103 
A 828 
a-Complete Space 505 
71,72 830, 834 
Π¹₁ Relations 752 
Π₁ Formula 843 
Σ₁-Compactness Theorem for Admissible 
Fragments 251, 776 
Σ₁ Formula 843 
g 838 
ω-Compr 854 
ω-Complete Theory 78 
ω-Completeness Theorem 78 
ω-Consistency 825, 851f 
—, Local 853 
—, Uniform 853 
—, Global 853 
ω-Homogeneous Model 70 
ω-Logic 42 
ω-Parametrization (of a Class of 
Functions) 702 
ω-Parameterized Pointclass 730 
ω-Pseudo Complete 133 
ω-Rule 78, 752, 878 
ω₁-Incomplete Ultrafilter 129 
ω₁-Saturated 128 
Ω-Set 1070 